
Simplicity is the ultimate sophistication.” — Leonardo da Vinci

Saturday 24 January 2015

Trend analysis

The human brain learns by pattern recognition, or trend analysis. A pattern is a rule, or a set of rules, that repeats itself across all occurrences. Let's say that we study the occurrence of an event which is random in nature, with each outcome bearing a specific probability weight. This probability weight is our "pattern" or "repeating rule" in the observed trend. When the only data in hand is the occurrence of the event itself, a random experiment (counting how often a given outcome occurs among the total number of occurrences) is enough to provide the required pattern in most cases. But if we also record other variable parameters that change along with the outcome of the event being studied, then we may extend the probability density function from a simple constant to a relation involving the new parameters.

A general probability equation is just a constant value ($C_x$) that gives the probability of occurrence of the incident $x$, given that $x \in U$, where $U$ is the set containing all possible occurrences. Even subjects like artificial intelligence depend on pattern matching or trend analysis for evaluation. Probabilistic analysis is at the heart of many discoveries. The constant $C_x$ defined above can be extended to an expression involving more than one parameter, i.e. let there be a probabilistic relation $P(x, p_1, p_2, \ldots, p_n)$ which includes the parameter $C_x$ but forms an $\mathbb{R} \to \mathbb{R}$ relation. In this article I am going to show that any continuous and differentiable function defined on the real numbers that is used to predict an attribute of a system can be written as a linear polynomial relation of a probability density function that expresses the probability of a specific attribute of that system. If we take the function to be continuous and differentiable over the real numbers, then the function $f(x)$ can be defined as:
$f(x) = a\,p(x) + b$, with $a, b \in \mathbb{R}$. We know that $p(x)$ takes values in the interval $[0,1]$, which is a small subset of the real numbers, but by setting an arbitrarily large value for $a$ and defining $p(x)$ so that its values are very small compared to $a$, we can obtain any real relation we would like to form (including periodic, non-periodic and exponential relations). But that is not enough; we have not yet defined the actual attribute expressed by a probabilistic function defined in this way. Let there be a machine $S$ such that, each time it gets an input, it increments the weight field associated with that input (assuming that each input is associated with one weight field). Now assume that a hypothetical user enters a data set, giving the more favoured values as inputs more often and the less favoured values only a limited number of times, and carries this process on indefinitely, so that every value $x \in \mathbb{R}$ is entered at least once. This trains the system to respond specifically to the inputs it has received. The weight values are taken to be the output values.

So, the function that maps each input to its associated weight value forms the relative probability distribution of the likelihood that a particular value in $\mathbb{R}$ is entered at a randomly chosen instant during the training process. This requires that each input included in the function be entered a finite number of times. For any function $f(x)$, $p(x) = \lim_{a\to\infty} \frac{f(x) - b}{a}$. A further condition that can be added to this is the normalisation $\int_{0}^{\infty} p(x)\, dx = 1$. This shows that $p(x)$ depends on the number of times a particular value is entered into the system over the infinite number of attempts. We may use the above analysis in the commonly accepted and well-known trend analysis algorithms used by computers (and humans) to analyse observable trends in a changing system, one that changes with respect to a specific parameter, generally time. A trend analysis has variable parameters and one or more constant parameters, the latter generally being the rules employed in extracting specific parameters from the observable system.
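The weight-counting machine $S$ described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the inputs are a small finite set of discrete values rather than all of $\mathbb{R}$, training stops after a finite number of inputs, and the integral condition is replaced by a sum. The class name `WeightMachine` and its methods are hypothetical names chosen for this sketch.

```python
from collections import Counter


class WeightMachine:
    """Hypothetical stand-in for the machine S: one weight field per input."""

    def __init__(self):
        self.weights = Counter()

    def feed(self, x):
        # Each input increments the weight field associated with it.
        self.weights[x] += 1

    def p(self, x):
        # Relative weight of x: an empirical estimate of the probability
        # that x is the value entered at a randomly chosen instant.
        total = sum(self.weights.values())
        return self.weights[x] / total if total else 0.0


machine = WeightMachine()
for value in [3, 3, 3, 5, 5, 7]:  # favoured values are entered more often
    machine.feed(value)

distribution = {x: machine.p(x) for x in machine.weights}
```

Here `distribution` maps each input seen so far to its relative weight, and the relative weights add up to 1, which is the discrete counterpart of the normalisation condition on $p(x)$.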

Analysis based on series of recorded numbers
Let’s consider a series of numbers entered into the analysing system. The system’s goal is to find the next number that satisfies the recorded pattern. One way to correlate a set of $n$ abstract numbers $k_1, k_2, \ldots, k_n$ is to relate them to a general polynomial of degree $n-1$ with $n$ unknown coefficients, substitute the arithmetically progressing positions $1, 2, \ldots, n$, equate the result at each position to the corresponding $k_i$, and solve the resulting $n$ equations to arrive at unique values for the unknown coefficients. But this method is straightforward and of little value to us on a large scale, so the need arises for more rigorous methods of correlating the observable pattern. Because the same $n$ numbers can also be fitted by polynomials of higher degree, a set of $n$ numbers can be related in infinitely many ways, and this generalises the above method of pattern matching. So we cannot define exactly what kind of pattern should be looked for.
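As a rough illustration of the polynomial approach, the sketch below evaluates the interpolating polynomial directly with Lagrange's formula instead of solving the $n$ equations for the coefficients; the two routes give the same polynomial. The function name `interpolate` and the sample series are assumptions made for this example, and the terms are assumed to sit at positions $1, 2, \ldots, n$.

```python
from fractions import Fraction


def interpolate(values, x):
    """Evaluate at x the unique polynomial of degree < len(values)
    passing through the points (1, values[0]), (2, values[1]), ..."""
    n = len(values)
    total = Fraction(0)
    for i in range(n):
        term = Fraction(values[i])
        for j in range(n):
            if j != i:
                # Lagrange basis factor (x - x_j) / (x_i - x_j)
                term *= Fraction(x - (j + 1), i - j)
        total += term
    return total


series = [1, 4, 9, 16]

# The lowest-degree fit continues the squares.
next_simple = interpolate(series, 5)

# A second polynomial agrees on the first four terms but is forced
# through an arbitrary fifth value, so it continues differently.
other_fit = series + [99]
sixth_simple = interpolate(series, 6)
sixth_other = interpolate(other_fit, 6)
```

Both polynomials reproduce the four recorded numbers, yet they disagree about what comes later, which is the sense in which the same series can be related in infinitely many ways.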

So there are numerous kinds of patterns we can observe in a set of numbers. Let’s say that we wish to observe the pattern based on the change of each value with respect to its preceding value. If we have a series $k_1, k_2, \ldots, k_n$, then we get a difference series $\Delta k_1, \Delta k_2, \ldots, \Delta k_{n-1}$, where $\Delta k_1 = k_2 - k_1$, $\Delta k_2 = k_3 - k_2$, and so on. If the values entered into the system actually form a polynomial series, and more terms are given than the degree of the polynomial, then the exact succeeding value can be predicted by this “difference method”. From the series $S$ defined as $S = k_1, k_2, \ldots, k_n$ we can define another series $S'$ as $S' = \Delta k_1, \Delta k_2, \ldots, \Delta k_{n-1}$.
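A minimal sketch of the difference method follows. It builds the successive difference series and predicts the next term by adding up the last entry of each series, which amounts to assuming that the deepest difference stays constant. The function names `differences` and `predict_next` are hypothetical, and the input is assumed to come from a polynomial evaluated at consecutive integers.

```python
def differences(series):
    """Return the difference series: each term minus its predecessor."""
    return [b - a for a, b in zip(series, series[1:])]


def predict_next(series):
    """Extrapolate one term by summing the last entry of every difference row."""
    last_entries = []
    row = list(series)
    while row:
        last_entries.append(row[-1])
        row = differences(row)
    return sum(last_entries)


cubes = [1, 8, 27, 64, 125]
first_row = differences(cubes)
predicted = predict_next(cubes)
```

For the cubes above, `first_row` is `[7, 19, 37, 61]` and `predicted` is 216, the next cube.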

Observing closely, we can say that the series $S'$ has one term fewer than the series $S$. This goes on as we keep using the difference method to obtain $S'', S''', \ldots$ (i.e., $S''$ is defined as $(S')'$). Eventually the series is reduced to a single term. For simplicity we write $S''$ as $S^2$ and so on, so that $S^m$ denotes the series after $m$ difference operations. It is easy to show that $n+1$ consecutive terms generated by a polynomial of degree $n$, by substituting consecutive integral values in increasing order, are reduced to exactly one term after $n$ difference operations, and that this term is the same nonzero constant whichever $n+1$ consecutive terms are chosen. Let the series $S$ consist of $k$ terms generated by a polynomial $f$ of degree $n$ ($k > n$). Choosing $n+1$ consecutive terms from the $k$ terms and applying the difference operation $n$ times in succession, we get $S^n$, which contains just one term.
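The claim can be checked numerically. The sketch below uses an arbitrary cubic chosen only for illustration and the hypothetical helper `nth_difference`. It relies on the standard result that, for consecutive integer inputs, the $n$-th difference of a degree-$n$ polynomial equals $n!$ times its leading coefficient.

```python
from math import factorial


def nth_difference(series, n):
    """Apply the difference operation n times and return the resulting series."""
    row = list(series)
    for _ in range(n):
        row = [b - a for a, b in zip(row, row[1:])]
    return row


def f(x):
    # Example polynomial of degree 3 with leading coefficient 2.
    return 2 * x**3 - x + 5


four_terms = [f(x) for x in range(4)]  # n + 1 = 4 consecutive terms
six_terms = [f(x) for x in range(6)]   # k = 6 terms, k > n

single = nth_difference(four_terms, 3)   # exactly one term remains
constant = nth_difference(six_terms, 3)  # every entry is the same constant
vanished = nth_difference(six_terms, 4)  # one more operation gives zeros
```

With four terms, three difference operations leave the single value 12, which is $3! \cdot 2$. With six terms the third difference series is constant at 12, and the fourth is all zeros.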

This kind of trend analysis is again of practical use only if the given series is known to be generated by a polynomial; the presence of wrong terms in the series causes the prediction to deviate completely from the series defined by the polynomial. The efficiency of algorithms for this kind of trend analysis problem depends greatly on the influence of “noise”, that is, input that does not fit the actual trend being analysed. Many efficient methods are designed to ignore such noise values to a high degree of precision.
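To give a feel for how sensitive the plain difference method is to noise, the sketch below runs the same extrapolation on a clean series of squares and on a copy with one corrupted term. The helper `predict_next` and the choice of corrupted value are assumptions made for this example. The sketch shows only the problem and does not implement any noise-tolerant method.

```python
def predict_next(series):
    """Difference-method extrapolation: sum the last entry of every difference row."""
    total = 0
    row = list(series)
    while row:
        total += row[-1]
        row = [b - a for a, b in zip(row, row[1:])]
    return total


clean = [1, 4, 9, 16, 25]
noisy = [1, 4, 10, 16, 25]  # the third term is off by one

clean_prediction = predict_next(clean)
noisy_prediction = predict_next(noisy)
error = noisy_prediction - clean_prediction
```

A single term that is wrong by 1 moves the predicted next value from 36 to 46, because the error is amplified as it passes through each level of differences.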

copyright © 2015 K Sreram, all rights reserved.
