Find and remove outliers in PHP

2023-09-11 04:45:23 Author: 你若安好我便安心

Suppose I sample a selection of database records that return the following numbers:

20.50, 80.30, 70.95, 15.25, 99.97, 85.56, 69.77

Is there an algorithm that can be implemented efficiently in PHP to find the outliers (if there are any) in an array of floats, based on how far they deviate from the mean?

Recommended answer

OK, let's assume you have your data points in an array like so:

<?php $dataset = array(20.50, 80.30, 70.95, 15.25, 99.97, 85.56, 69.77); ?>

Then you can use the following function (see the comments for what is happening) to remove all numbers that fall outside the mean +/- the standard deviation multiplied by a magnitude you set (defaults to 1):

<?php

function remove_outliers($dataset, $magnitude = 1) {

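  $count = count($dataset);
  if ($count === 0) {
    return array(); // Nothing to filter; also avoids a division by zero below
  }

  $mean = array_sum($dataset) / $count; // Calculate the mean

  // Population standard deviation: square each value's distance from the mean,
  // average the squares, take the square root, then multiply by $magnitude.
  $squares = array_map("sd_square", $dataset, array_fill(0, $count, $mean));
  $deviation = sqrt(array_sum($squares) / $count) * $magnitude;

  // Return the values that lie within $mean +/- $deviation (original keys are preserved).
  return array_filter($dataset, function($x) use ($mean, $deviation) {
    return ($x <= $mean + $deviation && $x >= $mean - $deviation);
  });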
}

function sd_square($x, $mean) {
  return pow($x - $mean, 2);
} 

?>

For your example, this function returns the following with a magnitude of 1:

Array
(
    [1] => 80.3
    [2] => 70.95
    [5] => 85.56
    [6] => 69.77
)
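
The question asks how to find the outliers, while the function above only returns the values that are kept. A minimal sketch of one way to get both lists is shown here. The name `split_outliers` and the shape of its return array are hypothetical and not part of the original answer; the sketch assumes the same rule (mean +/- magnitude times the population standard deviation) and PHP 7 or later.

```php
<?php

// Hypothetical helper (not from the original answer): splits a dataset into
// inliers and outliers using the mean +/- magnitude * standard deviation rule.
function split_outliers(array $dataset, float $magnitude = 1.0): array {
    $count = count($dataset);
    if ($count === 0) {
        // Nothing to classify; also avoids a division by zero.
        return ['mean' => null, 'deviation' => null, 'inliers' => [], 'outliers' => []];
    }

    $mean = array_sum($dataset) / $count;

    // Population standard deviation (divide by N), scaled by $magnitude.
    $sumOfSquares = 0.0;
    foreach ($dataset as $x) {
        $sumOfSquares += ($x - $mean) ** 2;
    }
    $deviation = sqrt($sumOfSquares / $count) * $magnitude;

    // Values inside the band; array_filter preserves the original keys.
    $inliers = array_filter($dataset, function ($x) use ($mean, $deviation) {
        return abs($x - $mean) <= $deviation;
    });

    // Everything whose key was filtered out is an outlier.
    $outliers = array_diff_key($dataset, $inliers);

    return [
        'mean'      => $mean,
        'deviation' => $deviation,
        'inliers'   => $inliers,
        'outliers'  => $outliers,
    ];
}

$dataset = array(20.50, 80.30, 70.95, 15.25, 99.97, 85.56, 69.77);
$result  = split_outliers($dataset);

print_r($result['outliers']);              // keys 0, 3, 4 => 20.5, 15.25, 99.97
print_r(array_values($result['inliers'])); // inliers re-indexed from 0
```

For this dataset the mean is about 63.19 and the population standard deviation about 30.17, so with a magnitude of 1 the accepted band runs from roughly 33.02 to 93.35. With a magnitude of 2 the band widens to roughly 2.85 to 123.52 and no value is flagged. Use `array_values()` if you need the kept values re-indexed, since the original keys are preserved.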