Category: Software Design
a mathematical language/function deciphering primer?

does anyone know where i can find out how to read mathematical function/equations? some kind of mathematical symbol/language primer?

you know the usual kind of equations that involve those weird bent greek E's and various other symbols - i get the feeling they're not as complicated as they look and just some kind of key/explanation is needed.

like in a c book i was following it shows this mathematical function:

f(n) = f(n-1) + f(n-2)

and in c that's


next_number = current_number + old_number; !

(the exclamation mark isn't part of the code - it represents my surprise at how simple the c version is in comparison to the mathematical version)

anyone know of any good text that'll help me get from weird complicated looking mathematical functions to c code that i can (kind of) understand, preferably on the net? any ideas/pointers much appreciated. thanks.

If you have a local college you may try browsing their bookstore for introductory proofs books. These will usually be slimmer than, say a calculus book, but contain most of the symbols and notation used throughout abstract mathematics.

Good luck!

ok, thanks.

i'm trying to work this out - it's the poisson distribution formula :

If the average number of random occurrences per interval = m, the probability P of a occurrences in the interval is:

P(a) = e^(-m) [m^a / a!]

(m^a means m to the power of a)

(formula from this page http://info.bio.cmu.edu/Courses/03438/PBC97Poisson/PoissonPage.html)

if m=5 and a=3 :

m^a is 5 X 5 X 5 = 125

3! is 6

so 125 / 6 = 20.8333...

20.8333 X -3 = -62.5

P when a is 3, equals -62.5? is that right?

(the e in the equation - is that representing the sum of - usually that bent E greek symbol: is that what that e is representing?)

i've got the answer to this now.

i was wrong: e is not representing the sum of. that part means e to the power -m, or e^-m (e being the base of natural logarithms: 2.71828...). that result then needs multiplying by the result from the square brackets.

For future reference:
The "weird greek bent e symbol meaning sum of" is actually a SIGMA, the Greek capital letter S!


Originally posted by balance

P when a is 3, equals -62.5? is that right?


It's obviously not right, as the formula calculates a probability, and probabilities lie in [0, 1]...

the actual answer is about 0.14 (e^-5 x 125/6)...

As for getting from formulae to C code, I don't know... I would say it's more important to learn what the formulae mean and do the conversion yourself, and there are probably endless basic math tutorials on the web...

good luck, and have fun! ;) :)

yeah i've got this now. e is 2.7182... the natural log number - not the bent E, sigma, sum of all, at all - i was stabbing in the dark there obviously.

c code in case anyone's interested:


#include <stdio.h>
#include <math.h>

/* average occurrences: average number of random occurrences per interval */
#define AV_OCS 5.5

double fact(int a);

int main(void)
{
    double P; /* the probability of exactly a occurrences in the interval */
    int a;    /* the number of occurrences */

    printf("\nm = %g (average number of random occurrences per interval)\n"
           "Probability of occurrences in intervals :\n"
           "------------------------------------------------------------------\n", AV_OCS);
    for (a = 0; a < 20; a++) {
        P = exp(-AV_OCS) * pow(AV_OCS, a) / fact(a);
        printf("%2d: %f %.2f%%%s\n", a, P, P * 100.0, a == 0 ? " < probability of no occurance" : "");
    }
    putchar('\n');
    return 0;
}

/* factorial: product of all positive integers up to a certain value */
double fact(int a)
{
    int i;
    double b = 1.0;

    for (i = 1; i <= a; i++)
        b *= (double)i;
    return b;
}

which outputs:


m = 5.5 (average number of random occurrences per interval)
Probability of occurrences in intervals :
------------------------------------------------------------------
0: 0.004087 0.41% < probability of no occurance
1: 0.022477 2.25%
2: 0.061812 6.18%
3: 0.113323 11.33%
4: 0.155819 15.58%
5: 0.171401 17.14%
6: 0.157117 15.71%
7: 0.123449 12.34%
8: 0.084871 8.49%
9: 0.051866 5.19%
10: 0.028526 2.85%
11: 0.014263 1.43%
12: 0.006537 0.65%
13: 0.002766 0.28%
14: 0.001087 0.11%
15: 0.000398 0.04%
16: 0.000137 0.01%
17: 0.000044 0.00%
18: 0.000014 0.00%
19: 0.000004 0.00%


what i'd like to know is can this be viewed a bit differently than intended?

m, according to where i got this equation from in the first place (linked to earlier in the thread), is "the average number of random occurrences per interval"

is it possible / ok to switch m to refer to something else: intervals? apply m to mean which interval, rather than occurrences in an interval? ok? possible?

so at the moment the 0,1,2,3,4 in the above list refer to the total occurrences in one particular interval: that's the official correct version as it stands. i'd like to use this for characters in text: it's impossible to have 2 occurrences of the letter 'e' at one interval for example. it's also impossible to have 2 occurrences of any character at one interval. in text there's one character per interval - that's absolute. the question is which character occurs at each interval.

so m in the above example is 5.5, which is a reasonable value for how often the letter e occurs (about one every 5.5 characters). if say you were looking at an e that has occurred in a particular position in the text, would it be ok to apply 0, 1, 2, 3 to the positions of text from the position of that e? so 8 would refer to 8 character positions along from where the current e we're on is, if you see what i mean?

just seems a bit dodgy repurposing what m means, but then on the other hand seems ok. i'm not sure. anyone?

I get the impression that you don't quite understand the poisson distribution - it is a counting distribution, used to model the probabilities of differing counts resulting from a random process.

For example, the number of heads you count in tossing a coin a hundred times averages 50, so m, as in your link, would be 50 (strictly that count is binomial, and the poisson distribution only approximates it well when the event is rare). The function you included then gives the probabilities that the number of heads is any given number, such as 48 / 49 / 50 / 51 / 52 / etc.

If you want to go further and model such things as the average number of consecutive heads or the average length between heads then you'll need more than the poisson distribution (I can't remember what the approach is here but i'm sure a quick search will get you started)

I get the impression that you don't quite understand the poisson distribution
not entirely, no, i don't think

it is a counting distribution, used to model the probabilities of differing counts resulting from a random process.
yeah, i understand that though. also it does it in quite a natural way. and characters are something you can count and occur in a random process. there's no reason you can't use a poisson function with character frequencies in text.

what i'm asking above isn't a cumulative question. i'm not hoping to get this to take into account likelihood on a cumulative basis (yet), if you see what i mean: like there hasn't been an e for 10 characters now, therefore the likelihood of an e has risen steeply and is very high now <<< that isn't what i'm asking or trying to get at here.


basically, instead of using the equation to tell you the likelihood of the number of occurrences at an interval, rather the number of intervals that might pass before an occurrence.

eg: say you had a very narrow vision of the text and could only see one character, say the character at position 40. and that was an e. and had previously found out that e's occur on average every 5.5 characters. then would it be cool to apply the results from the c code above and say "the likelihood of an e character being 3 character positions away (meaning position 43) is 11.33%, and the likelihood of 11 character positions away (51) being an e is 1.43%"? that seems ok to me. isn't it?

again, nothing cumulative going on, yet.

does anyone think it's ok to view m slightly differently, like this: "average number of intervals per random occurrence"

rather than: "average number of random occurrences per interval" which is what it is at the moment?

I'm not entirely sure that even makes sense... for each particular random occurrence there's only one interval to which it belongs.... :-|

even if it does, these distributions (as I remember) are generally considered memoryless and so you can't work in prior information such as how many characters ago the last e occurred etc.

what are you trying to do anyway? text compression?

I'm not entirely sure that even makes sense... for each particular random occurrence there's only one interval to which it belongs.... :-|
yes, i completely agree with you there. i didn't say something that goes against that, at least i didn't intend to. you're referring to...:

"average number of intervals per random occurrence"

...i think. yup, ok i've said it badly. "average number of intervals that pass before you get a random occurrence". basically, i think, i'm hoping to use m to count intervals rather than events.

you've got e's that occur on average say, every 5.5 intervals. with the original function (the way it's used where i got it from, linked to above) the intention behind the function says m is the average number of events per interval. that's obviously ridiculous with text because you can't have more than one character per interval.


even if it does, these distributions (as I remember) are generally considered memoryless and so you can't work in prior information such as how many characters ago the last e occurred etc.
i realise the poisson function in itself, on its own, won't do that for me, but that additional part you mention is very easy to do with usual c programming around / extra to the poisson function: no problem, honest.

what are you trying to do anyway? text compression?
text analysis, i think it is.

i've posted this question on a maths forum but no-one's answered, but then no-one ever answers me on that particular board! :) but i wrote the question differently there. this is the main chunk from that in case that explains it better than i have already (sorry this is getting *so* long :/ and i honestly have quite a strong suspicion the answer may be very simple? i don't know) :


simple, short version:

is it possible, instead of using the equation to give the likelihood of the number of event occurrences in an interval (as it is at the moment), to use it to give the number of intervals that might pass before an event occurrence?


wordy version:

the way the poisson function is used originally, where i got it from, the variable a (0...19 in the above results) represents the number of occurrences of an event in an interval. so 6 represents 6 events occurring in an interval (emphasis: in 1 single interval).

but that doesn't tally up with what i'd like to use it for. here's a description of the situation that i'd like to apply the poisson function to:

1. all intervals in space are identical in size to each other (which is the same as in the original situation).
2. events, if any, fall exactly into intervals - an event can't be half in one interval and half in another (* see note).
3. there can't be more than one event in an interval (different to the original).

(* point 2 may be irrelevant, or no different from the original in any case. i'm not sure. there's a diagram giving a visual representation of the original being applied on this page with 50 intervals and 50 events: http://info.bio.cmu.edu/courses/03438/pbc97poisson/poissonpage.html but i think possibly the whole of point 2 may be irrelevant. i've included it just to be safe - probably can completely ignore point 2.)

the main difference between the original situation and the one i'd like to apply the poisson formula to is this: there's absolutely only 2 possible states an interval can be in: no event, or an event.

(btw what i've described there is normal text, and an interval is a character position and an event is a particular pre-chosen character, say an 'e'. so a non event would be any character other than an 'e', the pre-chosen character. and m is the average occurrence of the chosen character in the text.)

applying the results from the c code above to the situation i've just described, and say the first interval (interval number 1) contains an event: would it then be ok / correct to say "the likelihood of an event that occurs on average every 5.5 intervals being at, say, interval number 3, is 11.33%, and the likelihood of an event being at, say, position 11, is 1.43%" ?

sorry if i'm repeating myself a bit here but i'll try again anyway.
what i'm trying to say is that the poisson distribution models a count of the number of events occurring within a given interval.

what you want to do, as far as i can judge, is to model the length of interval between events - this is, as i remember, not what the poisson distribution is used for - the exponential distribution models this.

what i'm trying to say is that the poisson distribution models a count of the number of events occurring within a given interval.
ok. the 'e', which is the event, occurs on average 0.1794872 times per interval - stated that way it fits with what you've said there.


m = 0.179487 (average number of random occurrences per interval)
Probability of occurrences in intervals :
------------------------------------------------------------------
0: 0.835699 83.57% < probability of no occurance
1: 0.149997 15.00%
2: 0.013461 1.35%
3: 0.000805 0.08%
4: 0.000036 0.00%
5: 0.000001 0.00%
6: 0.000000 0.00%
7: 0.000000 0.00%
8: 0.000000 0.00%
9: 0.000000 0.00%
10: 0.000000 0.00%

but that isn't much use really :/ not that i can see anyway. oh well.

well, if it doesn't fit or work, it doesn't work. fair enough. although i'm still not *completely* convinced a poisson formula can't be used for this in some way, but i see your point definitely.


what you want to do, as far as i can judge, is to model the length of interval between events - this is, as i remember, not what the poisson distribution is used for - the exponential distribution models this.
yeah, or the number of intervals more precisely.

ok, i will look into exponential distribution models.

thanks very much for the info and reply :)

no probs on my part... let us know how ya get on... ;) :)









