statistic analyzer
Results 1 - 15 of about 743
Statistics::Candidates 0.9
Candidates is a Perl5 module for manipulating candidate features (help module for Statistics::MaxEntropy).
SYNOPSIS
use Statistics::Candidates;
# create a new candidates object and read candidate features
$candidates = Statistics::Candidates->new($some_file);
# checks for constant candidate features
$candidates->check();
# writes candidates that were not added to a file
$candidates->write($some_other_file);
# clear the administration about being added or not ...
$candidates->clear();
The Candidates object is for storage, retrieval, and manipulation of candidate features.
The reason for separating this code from MaxEntropy.pm is that a set of candidate features should be considered a separate object. Blessing them into MaxEntropy would be unnatural.
Also this code is simpler and more stable than that in the MaxEntropy module.
This module requires Bit::SparseVector.
Download (0.041MB)
Added: 2006-11-02 License: Perl Artistic License Price:
1086 downloads
SendmailAnalyzer 3.0
Sendmail Analyzer is a Perl script that produces full HTML and graphical reports of sendmail usage. It reports statistics on inbound and outbound messages, largest messages, senders and recipients, relays, domains, and complete mailbox usage if your country's law allows it.
Statistics are generated per hour, day, month and year. Graphs are in PNG format.
Download (0.023MB)
Added: 2007-06-27 License: Perl Artistic License Price:
850 downloads
Practical Query Analyzer 1.6
Practical Query Analyzer produces HTML reports on the most frequent queries, slowest queries, queries by type (select/insert/update/delete), and more for both PostgreSQL and MySQL log files.
Download (0.05MB)
Added: 2005-11-28 License: BSD License Price:
1428 downloads
Network Traffic Analyzer 1.0
Network Traffic Analyzer analyzes the network traffic on multiple network devices and creates HTML statistics with some network usage graphs. Sometimes it is good to know how the network is used: how many bytes were received and how many were sent. Network Traffic Analyzer creates simple network usage statistics for exactly that purpose.
Such statistics can tell you how good your network connection really is (regardless of what your Internet provider says), when the network was down, and which time of day is best for downloading large amounts of data. You can also use the software simply to get a better sense of how much traffic your home computer generates.
Download (0.026MB)
Added: 2006-06-29 License: GPL (GNU General Public License) Price:
1233 downloads
Statistics::GaussHelmert 0.05
Statistics::GaussHelmert is a general weighted least squares estimation module.
SYNOPSIS
use Statistics::GaussHelmert;
# create an empty model
my $estimation = new Statistics::GaussHelmert;
# setup the model given observations $y, covariance matrices
# $Sigma_yy, an initial guess $b0 for the unknown parameters.
$estimation->observations($y);
$estimation->covariance_observations($Sigma_yy);
$estimation->initial_guess($b0);
# specify the implicit model function and its Jacobians by using
# closures.
$estimation->observation_equations(sub { ... });
$estimation->Jacobian_unknowns(sub { ... });
$estimation->Jacobian_observations(sub { ... });
# Maybe we want to impose some constraints on the unknown
# parameters, this is not mandatory
$estimation->constraints(sub { ... });
$estimation->Jacobian_constraints(sub { ... });
# start estimation
$estimation->start(verbose => 1);
# print result
print $estimation->estimated_unknown(),
$estimation->covariance_unknown();
This module is a flexible tool for estimating model parameters given a set of observations. The module provides functions for a linear estimation model; the underlying model is called the Gauss-Helmert model.
Statistics::GaussHelmert differs from modules such as Statistics::OLS in that it can fit arbitrary functions to data of any dimension. You have to specify an implicit minimization function (in contrast to the explicit functions used in traditional regression methods) and its derivatives with respect to the unknowns and the observations. You may also specify constraint functions on the unknowns (with their derivatives). Furthermore, you need an approximate solution to start from. For some problems it is easy to find an approximate solution by solving directly for the unknown parameters using a few well-chosen observations.
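As a rough orientation, and not taken from the module's documentation, the Gauss-Helmert model is commonly written as shown here. The symbols g, h, v, A, B and w are notation assumed for illustration; g, h, A and B appear to correspond to observation_equations, constraints, Jacobian_unknowns and Jacobian_observations in the synopsis, but that mapping is an assumption.

```latex
% y: observations, v: corrections to the observations, b: unknown parameters,
% b_0: initial guess, \Sigma_{yy}: covariance matrix of the observations.
\begin{align}
  \min_{v,\,b}\; & v^{\mathsf T}\,\Sigma_{yy}^{-1}\,v
  && \text{subject to } g(y + v,\, b) = 0,\quad h(b) = 0 \\
  & A\,\Delta b + B\,v + w \approx 0,
  && A = \left.\frac{\partial g}{\partial b}\right|_{(y,\,b_0)},\;
     B = \left.\frac{\partial g}{\partial y}\right|_{(y,\,b_0)},\;
     w = g(y,\, b_0)
\end{align}
```

The second line is the first-order expansion of g around the observations and the initial guess, with b = b_0 + \Delta b, which is why an approximate solution is needed before the estimation can start.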
Download (0.013MB)
Added: 2007-07-08 License: Perl Artistic License Price:
838 downloads
Python Traffic Camera Analyzer
Python Traffic Camera Analyzer is an automated traffic camera congestion analysis tool.
Python Traffic Analyzer is a Python base class and sample driver script written to retrieve and manipulate images from the TrafficLand cameras and calculate a numeric value representing the current traffic flow.
PyTrAn, an example driver script, an image collector and an image mask creator are available for download from the link shown at the bottom. To use the PyTrAn package, begin by choosing a camera that you wish to analyze; for this example we'll use the camera captioned above.
We want to construct a mask over the area of the image that we are interested in, namely the road. In this particular example the road takes up the majority of the image, but that is not always the case.
We will apply the mask over captured images to fine-tune the area in which we look for movement. To create the mask we first need to collect a sequential series of snapshots from the target camera. The image_collector.py script was written for this task:
$ mkdir mask_200003
$ cd mask_200003
$ ../image_collector.py 200003 30
Collecting 30 images...
30
Done.
The script is hard-coded to capture images at a 2-second interval. The delay is necessary to ensure the image has changed; I believe 2 seconds to be the absolute minimum. Once the script completes, 30 images numbered 1 through 30 will have been created in the current directory.
We construct a mask from these captured images by creating a diff-image for each sequential image pair and then adding each diff-image together. Naturally, a script was written to automate this task as well:
$ ../mask_maker.py 1 30
Creating a diff for each sequential image pair.
Diffing 29
Creating the initial mask from the first image pair.
Adding the rest of the diffs to the mask.
Masking 29
Done.
A number of .diff files are generated in this process. These files represent the movement between individual sequence pairs.
The .diff files are simply intermediary files; the important part is the mask file, which is generated as the sum of all differences.
The mask file may be dirty (as in this case) and require manual cleanup. The basic shape of the road, however, is clearly visible, which is evidence that the mask generation process can be automated with minimal effort. Also, this run was conducted at night; daytime images yield better results.
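The diff-and-sum step described above can be sketched in a few lines of Python. This is only an illustration and not the code in mask_maker.py: the function names, the use of nested lists of grayscale values, and the clamp at 255 are all assumptions.

```python
def diff_image(a, b):
    """Absolute per-pixel difference of two equally sized grayscale frames."""
    return [[abs(pa - pb) for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def build_mask(frames, max_value=255):
    """Sum the diffs of every sequential frame pair, clamped to max_value."""
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    height, width = len(frames[0]), len(frames[0][0])
    mask = [[0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        d = diff_image(prev, curr)
        for y in range(height):
            for x in range(width):
                mask[y][x] = min(max_value, mask[y][x] + d[y][x])
    return mask


# Three tiny made-up frames: the top row never changes, the bottom row does.
frames = [
    [[10, 10, 10], [10, 200, 10]],
    [[10, 10, 10], [10, 10, 200]],
    [[10, 10, 10], [200, 10, 10]],
]
mask = build_mask(frames)
```

Pixels that never change stay at zero in the mask, while pixels that change in any pair accumulate a nonzero value, which is why the road shape emerges after enough frames.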
There are a few final steps we need to take before we can use the example PyTrAn driver script. First we need to convert the mask to ASCII (noraw) format:
$ pnmnoraw mask > mask_200003.ascii
Then we need to open an ImageMagick display window and get its X-window-ID using xwininfo. Finally, update camera_id and window_id in pytran_sampling.py and launch the driver:
$ ../pytran_sampling.py
DEBUG> grabbing frame from camera 200003
DEBUG> rotating image: pytran.this > pytran.last
DEBUG> refreshing image in 3 secs
taking a 5 minute sample at various thresholds.
DEBUG> grabbing frame from camera 200003
DEBUG> generating frame diff on pytran.last, pytran.this
DEBUG> displaying image: pytran.diff
DEBUG> converting pytran.diff to ascii
DEBUG> calculating traffic ratio...
ratio[5]: 55%
DEBUG> calculating traffic ratio...
ratio[10]: 52%
...
...
5 minute sample[5]: 67.88
5 minute sample[10]: 42.66
5 minute sample[15]: 30.57
5 minute sample[20]: 23.03
5 minute sample[25]: 18.39
5 minute sample[30]: 14.79
5 minute sample[35]: 12.42
5 minute sample[40]: 10.53
5 minute sample[45]: 9.06
5 minute sample[50]: 7.85
The sampling script takes 5-minute samples at varying color thresholds. The optimal threshold must be chosen manually. Furthermore, you will need to sample the traffic ratios during both heavy and light traffic to get a good feel for your acceptable range. Keep in mind that the traffic ratio is simply the percent change detected, in other words the movement detected within the masked region. This means that a completely empty road will register values similar to those of a road so congested it looks like a parking lot. The time of day can be combined with the traffic ratio to tell the two cases apart.
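One way to read "percent change detected within the masked region" is sketched here. The source does not say how the color threshold is applied, so the rule used (a masked pixel counts as changed when its diff value exceeds the threshold) and the function name traffic_ratio are assumptions.

```python
def traffic_ratio(diff, mask, threshold):
    """Percent of masked pixels whose diff value exceeds the threshold."""
    masked = changed = 0
    for diff_row, mask_row in zip(diff, mask):
        for d, m in zip(diff_row, mask_row):
            if m:  # only pixels inside the mask are considered
                masked += 1
                if d > threshold:
                    changed += 1
    return 100.0 * changed / masked if masked else 0.0


# Made-up 2x3 frame diff and mask (nonzero mask value = road pixel).
diff = [[0, 12, 40], [7, 60, 3]]
mask = [[0, 1, 1], [1, 1, 0]]
ratios = {t: traffic_ratio(diff, mask, t) for t in (5, 10, 50)}
```

With this rule a higher threshold can only lower the ratio, which matches the falling values in the sample output across thresholds 5 through 50.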
With this task implemented and abstracted, more complex systems can be built. When I find the time, I'd like to create a system that takes multiple potential travel routes and times and, during the travel time, e-mails the traveler the best route to take. Another idea would be to record the traffic flow values for each camera, for each day and each half-hour interval. Travelers and other interested parties could then analyze traffic patterns to determine the fastest route depending on date and time.
Download (0.003MB)
Added: 2005-05-20 License: GPL (GNU General Public License) Price:
1620 downloads
WiFi Statistics Daemon 1.0a
wifistatd is a script which generates a PNG graphing signal/noise/link levels on a selected wireless interface.
To install wifistatd on a UNIX machine, untar the archive containing the program.
Then type:
./wifistatd.pl install
If everything went OK (it should), you'll get the db.rrd database file in your current working directory.
To configure the daemon, edit the head section of wifistatd.pl.
Getting started
To start, just type:
./wifistatd.pl start
To stop, just type:
./wifistatd.pl stop
Download (0.009MB)
Added: 2006-06-27 License: GPL (GNU General Public License) Price:
1216 downloads
Statistics::Gap 0.10
Statistics::Gap Perl module is an adaptation of the Gap Statistic.
SYNOPSIS
use Statistics::Gap;
$predictedk = &gap("prefix", "vec", INPUTMATRIX, "rbr", "h2", 30, 10, rep, 90, 4);
OR
use Statistics::Gap;
$predictedk = &gap("prefix", "vec", INPUTMATRIX, "rbr", "h2", 30, 10, rep, 90, 4, 7);
INPUTS
1. Prefix: The string that should be used as a prefix when naming the intermediate files and the .dat files (plot files).
2. Space: Specifies the space in which the clustering should be performed.
Valid parameter values:
vec - vector space
sim - similarity space
3. InputMatrix: Path to input matrix file. (More details about the input file-format below.)
4. ClusteringMethod: Specifies the clustering method to be used. (Learn more about this at: http://glaros.dtc.umn.edu/gkhome/cluto/cluto/overview)
Valid parameter values:
rb - Repeated Bisections
rbr - Repeated Bisections followed by k-way refinement
direct - Direct k-way clustering
agglo - Agglomerative clustering
bagglo - Partitional biased Agglomerative clustering
NOTE: bagglo can be used only if space=vec
5. Crfun: Specifies the criterion function to be used for finding clustering solutions. (Learn more about this at: http://glaros.dtc.umn.edu/gkhome/cluto/cluto/overview)
Valid parameter values:
i1 - I1 Criterion function
i2 - I2 Criterion function
e1 - E1 Criterion function
h1 - H1 Criterion function
h2 - H2 Criterion function
6. K: This is an approximate upper bound for the number of clusters that may be present in the dataset.
7. B: The number of replicates/references to be generated.
8. TypeRef: Specifies whether to generate B replicates from a reference or to generate B references.
Valid parameter values:
rep - replicates
ref - references
9. Percentage: Specifies the percentage confidence to be reported in the log file. Since Statistics::Gap uses the parametric bootstrap method to generate the reference distribution, it is critical to understand the interval around the sample mean that could contain the population ("true") mean, and with what certainty.
10. Precision: Specifies the precision to be used while generating the reference distribution.
11. Seed: The seed to be used with the random number generator. (This is an optional parameter. By default no seed is set.)
Download (2.5MB)
Added: 2007-05-23 License: Perl Artistic License Price:
884 downloads
Statistics::OLS 0.07
Statistics::OLS is a Perl module to perform ordinary least squares and associated statistics.
SYNOPSIS
use Statistics::OLS;
my $ls = Statistics::OLS->new();
$ls->setData (\@xydataset) or die( $ls->error() );
$ls->setData (\@xdataset, \@ydataset);
$ls->regress();
my ($intercept, $slope) = $ls->coefficients();
my $R_squared = $ls->rsq();
my ($tstat_intercept, $tstat_slope) = $ls->tstats();
my $sigma = $ls->sigma();
my $durbin_watson = $ls->dw();
my $sample_size = $ls->size();
my ($avX, $avY) = $ls->av();
my ($varX, $varY, $covXY) = $ls->var();
my ($xmin, $xmax, $ymin, $ymax) = $ls->minMax();
# returned arrays are x-y or y-only data
# depending on initial call to setData()
my @predictedYs = $ls->predicted();
my @residuals = $ls->residuals();
I wrote Statistics::OLS to perform Ordinary Least Squares (linear curve fitting) on two-dimensional data: y = a + bx. The other simple statistical module I found on CPAN (Statistics::Descriptive) is designed for univariate analysis. It accommodates OLS, but somewhat inflexibly and without rich bivariate statistics. Nevertheless, it might make sense to fold OLS into that module or a supermodule someday.
Statistics::OLS computes the estimated slope and intercept of the regression line, their T-statistics, R squared, standard error of the regression and the Durbin-Watson statistic. It can also return the residuals.
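For readers who want to see what these quantities are, here is a minimal Python sketch using the textbook formulas for a two-variable regression. It is not the module's code; the function name ols, the dictionary keys and the sample data are assumptions made for illustration.

```python
import math


def ols(xs, ys):
    """Fit y = a + b*x by ordinary least squares and return summary statistics."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line if all x values are equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sse = sum(e * e for e in residuals)  # sum of squared residuals
    # Durbin-Watson: squared successive residual differences over sse.
    dw_num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n))
    return {
        "intercept": intercept,
        "slope": slope,
        "r_squared": 1 - sse / syy if syy else None,
        "sigma": math.sqrt(sse / (n - 2)),  # standard error of the regression
        "durbin_watson": dw_num / sse if sse else None,
        "residuals": residuals,
    }


fit = ols([1, 2, 3, 4], [2, 4, 5, 8])
```

The sketch covers only one independent variable, in line with the module's stated scope.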
It is pretty simple to do two dimensional least squares, but much harder to do multiple regression, so OLS is unlikely ever to work with multiple independent variables.
This is beta code and has not been extensively tested. It has worked on a few published datasets. Feedback is welcome, particularly if you notice an error or try it with known results that are not reproduced correctly.
Download (0.008MB)
Added: 2007-05-23 License: Perl Artistic License Price:
531 downloads
Statistics::LineFit 0.07
Statistics::LineFit is a Perl module for least squares line fitting, weighted or unweighted.
SYNOPSIS
use Statistics::LineFit;
$lineFit = Statistics::LineFit->new();
$lineFit->setData (\@xValues, \@yValues) or die "Invalid data";
($intercept, $slope) = $lineFit->coefficients();
defined $intercept or die "Can't fit line if x values are all equal";
$rSquared = $lineFit->rSquared();
$meanSquaredError = $lineFit->meanSqError();
$durbinWatson = $lineFit->durbinWatson();
$sigma = $lineFit->sigma();
($tStatIntercept, $tStatSlope) = $lineFit->tStatistics();
@predictedYs = $lineFit->predictedYs();
@residuals = $lineFit->residuals();
($varianceIntercept, $varianceSlope) = $lineFit->varianceOfEstimates();
The Statistics::LineFit module does weighted or unweighted least-squares line fitting to two-dimensional data (y = a + b * x). (This is also called linear regression.) In addition to the slope and y-intercept, the module can return the square of the correlation coefficient (R squared), the Durbin-Watson statistic, the mean squared error, sigma, the t statistics, the variance of the estimates of the slope and y-intercept, the predicted y values and the residuals of the y values. (See the METHODS section for a description of these statistics.)
The module accepts input data in separate x and y arrays or a single 2-D array (an array of arrayrefs). The optional weights are input in a separate array. The module can optionally verify that the input data and weights are valid numbers. If weights are input, the line fit minimizes the weighted sum of the squared errors and the following statistics are weighted: the correlation coefficient, the Durbin-Watson statistic, the mean squared error, sigma and the t statistics.
The module is state-oriented and caches its results. Once you call the setData() method, you can call the other methods in any order or call a method several times without invoking redundant calculations. After calling setData(), you can modify the input data or weights without affecting the module's results.
The decision to use or not use weighting could be made using your a priori knowledge of the data or using supplemental data. If the data is sparse or contains non-random noise, weighting can degrade the solution. Weighting is a good option if some points are suspect or less relevant (e.g., older terms in a time series, points that are known to have more noise).
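To make the effect of weighting concrete, here is a small Python sketch of a weighted least-squares line fit that minimizes the weighted sum of squared errors. It is an illustration under stated assumptions and not the module's implementation; the function name weighted_line_fit and the sample data are invented.

```python
def weighted_line_fit(xs, ys, weights=None):
    """Return (intercept, slope) minimizing sum(w * (y - a - b*x)**2)."""
    if weights is None:
        weights = [1.0] * len(xs)  # unweighted fit
    if not (len(xs) == len(ys) == len(weights)):
        raise ValueError("xs, ys and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    if sxx == 0:
        raise ValueError("cannot fit a line if x values are all equal")
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# The last point is a suspect outlier; a zero weight removes its influence.
xs, ys = [0, 1, 2, 3], [1, 3, 5, 100]
plain = weighted_line_fit(xs, ys)
robust = weighted_line_fit(xs, ys, [1, 1, 1, 0])
```

Comparing plain and robust shows how a single noisy point can dominate an unweighted fit, which is the situation in which the text recommends weighting.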
Download (0.024MB)
Added: 2007-07-12 License: Perl Artistic License Price:
835 downloads
Statistics::Hartigan 0.01
Statistics::Hartigan is a Perl extension for the stopping rule proposed by J. Hartigan. Reference: Hartigan, J. (1975). Clustering Algorithms. John Wiley and Sons, New York, NY, US.
SYNOPSIS
use Statistics::Hartigan;
&hartigan(InputFile, "agglo", 6, 10);
Input file is expected in the "dense" format -
Sample Input file:
6 5
1 1 0 0 1
1 0 0 0 0
1 1 0 0 1
1 1 0 0 1
1 0 0 0 1
1 1 0 0 1
Hartigan's rule uses the Within Cluster/Group Sum of Squares (WGSS) to estimate the number of clusters a given dataset naturally falls into. The goal is to minimize WGSS.
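The following Python sketch shows how WGSS can be computed for the sample input above and how it feeds the commonly cited form of Hartigan's index, H(k) = (W(k) / W(k+1) - 1) * (n - k - 1). The formula is stated from general knowledge of the method, not from this module's documentation, and the function names and the hand-picked cluster labels are assumptions; the module itself obtains its clusterings elsewhere.

```python
def parse_dense(text):
    """Parse the dense format: a 'rows cols' header, then one row per line."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    n_rows, n_cols = int(lines[0][0]), int(lines[0][1])
    rows = [[float(v) for v in ln] for ln in lines[1:]]
    if len(rows) != n_rows or any(len(r) != n_cols for r in rows):
        raise ValueError("matrix does not match its header")
    return rows


def wgss(rows, labels):
    """Sum of squared distances of each row to its cluster centroid."""
    clusters = {}
    for row, label in zip(rows, labels):
        clusters.setdefault(label, []).append(row)
    total = 0.0
    for members in clusters.values():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum((v - c) ** 2
                     for row in members for v, c in zip(row, centroid))
    return total


def hartigan_index(w_k, w_k1, n, k):
    """H(k) from WGSS with k and k+1 clusters; w_k1 must be nonzero."""
    return (w_k / w_k1 - 1) * (n - k - 1)


sample = """6 5
1 1 0 0 1
1 0 0 0 0
1 1 0 0 1
1 1 0 0 1
1 0 0 0 1
1 1 0 0 1"""
rows = parse_dense(sample)
w1 = wgss(rows, [0] * 6)             # everything in one cluster
w2 = wgss(rows, [0, 1, 0, 0, 1, 0])  # hand-picked two-cluster split
h1 = hartigan_index(w1, w2, n=len(rows), k=1)
```

A frequently quoted rule of thumb is to keep adding clusters while H(k) stays above 10; that threshold is likewise an assumption here rather than something the listing states.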
Download (0.006MB)
Added: 2007-05-23 License: Perl Artistic License Price:
884 downloads
Visitors Web Log Analyzer 0.61
Visitors is a very fast web log analyzer for Linux, Windows, and other Unix-like operating systems. It takes a web server log file as input and outputs statistics in the form of different reports. Its design principles are very different from those of other software of the same type:
No installation is required, and it can process up to 150,000 lines of log entries per second on fast computers (20 MB/s with the average line length of my log files).
It is designed to be run from the command line and outputs HTML and text reports. The text report can be piped to less to check web stats over ssh.
Support for real-time statistics with the Visitors Stream Mode, introduced in version 0.3.
There is no need to specify the log format. It works out of the box with Apache and most other web servers that use a standard log format (see the documentation for more information on the format).
It's a portable C program that can be compiled on many different systems. Binaries for Windows systems are in the Download section of this page.
The HTML report it produces doesn't contain images or external CSS; it is self-contained, so you can send it by email to users.
Visitors is free software (and, of course, freeware) under the terms of the GPL license. You don't need to pay to use it. Visitors is supported: if you want a custom version made directly by the original author for a modest price, contact me at antirez (at) invece.org. ISPs may take advantage of the high processing speed.
Main features:
- Requested pages.
- Requested images.
- Referers by hits and age.
- Unique visitors in each day.
- Page views per visit.
- Pages accessed by the Google crawler (and the date of Google's last access on every page).
- Percentage of visits originating from Google searches for every day.
- User navigation patterns (web trails).
- Keyphrases used in Google searches.
- User agents.
- Weekday and hour distributions of accesses.
- Weekdays/Hours combined bidimensional map.
- Month/Year combined bidimensional map.
- Visual path analysis with Graphviz.
- Operating systems, browsers and domains popularity.
- 404 errors.
Enhancements:
- This release adds an important bugfix in the unique visitors algorithm.
- The output is now nearer to reality (though unique visitors stats are always a guess without the use of a cookie).
Download (0.11MB)
Added: 2005-11-05 License: GPL (GNU General Public License) Price:
1458 downloads
Statistics::ChisqIndep 0.1
Statistics::ChisqIndep is a Perl module to perform the chi-square test of independence (a.k.a. contingency tables).
Synopsis
# example for Statistics::ChisqIndep
use strict;
use Statistics::ChisqIndep;
use POSIX;
# input data in the form of an array of array references
my @obs = ([15, 68, 83], [23, 47, 65]);
my $chi = new Statistics::ChisqIndep;
$chi->load_data(\@obs);
# print the summary data along with the contingency table
$chi->print_summary();
# print the contingency table only
$chi->print_contingency_table();
# the following output is the same as calling print_summary();
# all of the detailed info, such as the expected values, degrees of freedom
# and totals, is accessible as fields of the object
# check whether the load_data() call was successful
if ($chi->{valid}) {
    print "Rows: ", $chi->{rows}, "\n";
    print "Columns: ", $chi->{cols}, "\n";
    print "Degree of Freedom: ", $chi->{df}, "\n";
    print "Total Count: ", $chi->{total}, "\n";
    print "Chi-square Statistic: ",
        $chi->{chisq_statistic}, "\n";
    print "p-value: ", $chi->{p_value}, "\n";
    print "Warning: some of the cell counts might be too low.\n"
        if ($chi->{warning});
    # output the contingency table
    my $rows = $chi->{rows};        # number of rows
    my $cols = $chi->{cols};        # number of columns
    my $obs = $chi->{obs};          # observed values
    my $exp = $chi->{expected};     # expected values
    my $rtotals = $chi->{rtotals};  # row totals
    my $ctotals = $chi->{ctotals};  # column totals
    my $total = $chi->{total};      # total counts
    for (my $j = 0; $j < $cols; $j++) {
        print "\t", $j + 1;
    }
    print "\trtotal\n";
    for (my $i = 0; $i < $rows; $i++) {
        print $i + 1, "\t";
        for (my $j = 0; $j < $cols; $j++) {
            # observed values can be accessed
            # in the following way
            print $obs->[$i]->[$j], "\t";
        }
        # row totals can be accessed
        # in the following way
        print $rtotals->[$i], "\n";
        print "\t";
        for (my $j = 0; $j < $cols; $j++) {
            # expected values can be accessed
            # in the following way
            printf "(%.2f)\t", $exp->[$i]->[$j];
        }
        print "\n";
    }
    print "ctotal\t";
    for (my $j = 0; $j < $cols; $j++) {
        # column totals can be accessed in the following way
        print $ctotals->[$j], "\t";
    }
    # output total counts
    print $total, "\n";
}
This module performs Pearson's chi-squared test on two-dimensional contingency tables. Users input the observed values in table form, and the module computes the expected values for each cell based on the independence hypothesis. The module then computes the chi-square statistic and the corresponding p-value from the observed and expected values to test whether the two dimensions are truly independent.
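The computation described above can be sketched in a few lines of Python. This is an illustration of the standard Pearson statistic, not the module's implementation; the function name `chi_square_independence` is hypothetical. Under independence, the expected count of a cell is (row total * column total) / grand total, and the degrees of freedom are (rows - 1) * (columns - 1).

```python
def chi_square_independence(observed):
    """Return (statistic, degrees_of_freedom, expected) for a 2-D table.

    `observed` is a list of equal-length rows of non-negative counts.
    Hypothetical helper for illustration only.
    """
    rows = len(observed)
    cols = len(observed[0])
    if rows < 2 or cols < 2 or any(len(r) != cols for r in observed):
        raise ValueError("need a rectangular table with at least 2 rows and 2 columns")

    row_totals = [sum(r) for r in observed]
    col_totals = [sum(observed[i][j] for i in range(rows)) for j in range(cols)]
    total = sum(row_totals)
    if any(t == 0 for t in row_totals + col_totals):
        raise ValueError("every row and column needs a non-zero total")

    # Expected counts under the independence hypothesis.
    expected = [[row_totals[i] * col_totals[j] / total for j in range(cols)]
                for i in range(rows)]

    # Pearson statistic: sum of (observed - expected)^2 / expected over all cells.
    statistic = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
                    for i in range(rows) for j in range(cols))
    df = (rows - 1) * (cols - 1)
    return statistic, df, expected


if __name__ == "__main__":
    # Same observed counts as in the Perl synopsis.
    stat, df, exp = chi_square_independence([[15, 68, 83], [23, 47, 65]])
    print(stat, df)
```

The p-value is the upper-tail probability of a chi-square distribution with `df` degrees of freedom at `statistic`; the sketch leaves that step out because it needs a distribution function from a statistics library.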
Download (0.003MB)
Added: 2006-12-18 License: Perl Artistic License Price:
1040 downloads
KSqlAnalyzer 0.3.0
KSqlAnalyzer is a tool for easily accessing the data of an MS SQL database. It is made for developing and testing new SQL queries directly on the server.
The functionality and look/feel of KSqlAnalyzer are similar to SQL Query Analyzer by Microsoft.
KSqlAnalyzer uses parts of the TDS library, and the editor uses parts of the KWrite source code because of its brilliant syntax highlighting.
Enhancements:
- Trigger viewing and editing, and a new Edit menu item (Clear).
Download (0.70MB)
Added: 2006-07-15 License: GPL (GNU General Public License) Price:
1197 downloads
Network Traffic Analyser 0.2.2
Network Traffic Analyser provides a script-driven network traffic monitor.
Network Traffic Analyser (formerly known as sniffer) is designed to be an extremely powerful, configurable, and versatile tool for monitoring network traffic.
It can be used as a plain sniffer, as a tool for accounting, dynamic firewall updates, and many more things.
It features scripting support and an event-driven architecture.
The idea behind this project is to create a powerful tool for playing around with network traffic. The basic concepts are simplicity and flexibility. Instead of building a tool that does something specific (and does a good job at it), we're trying to build a tool that will be able to do whatever you want it to do (and still be good at it :).
To put it very simply, you write a script that will be invoked for every packet that passes through your network. You write your scripts in Tcl, a full-blown, interpreted programming language. That guarantees that you won't be constrained in your creativity.
Download (0.11MB)
Added: 2007-02-22 License: GPL (GNU General Public License) Price:
983 downloads
Copyright Notice:
Software piracy is theft. Using cracks, passwords, serial numbers, registration codes, or key generators is illegal and prevents future software development. The above statistic analyzer search lists only software in full, demo, and trial versions for free download. Download links come directly from our mirror sites or publisher sites; torrent files and links from rapidshare.com, yousendit.com, or megaupload.com are not allowed.