How can I check a SQL table for average concurrent events based on the date, time, and duration of the events?

2023-09-11 03:20:21 Author: 帅到不像话


I have a set of call detail records, and from those records, I'm supposed to determine the average concurrent active calls per system, per hour (at a precision of one minute). If I query 7pm to 8pm, I should see the average concurrent calls for the hour (averaging the concurrent calls for each minute) within that hour (for each system).

So, I need a way to check for a count of active calls for 7:00-7:01, 7:01-7:02, etc., and then average those numbers. A call is considered active if the call's time and duration fall within the current minute being checked.

What makes this even more difficult is that it needs to span SQL 7.0 and SQL 2000 (some functions in 2000 aren't available in 7.0, such as GETUTCDATE()). If I can just get 2000 working, I'll be happy.

What approaches to this problem can I take?

I thought about looping through the minutes (60) in the hour being checked and adding up the count of calls that fall within each minute, then somehow cross-referencing the duration to make sure that a call that starts at 7:00 pm and has a duration of 300 seconds shows as active at 7:04, but I can't imagine how to approach the problem. I tried to figure out a way to weight each call against a particular minute that would tell me whether the call was active during that minute or not, but couldn't come up with an effective solution.

The data types here are the same as I have to query against. I don't have any control over the schema (other than possibly converting the data and inserting into another table with more appropriate data types). I've provided some example data that I know has concurrent active calls.

CREATE TABLE Records(
  seconds char(10),
  time char(4),
  date char(8),
  dur int,
  system int,
  port int
)

--seconds is an stime value. It's the difference in seconds from UTC 1/1/1970 00:00:00 to the current UTC time; we use it as an identifier (like epoch).
--time is the time the call was made.
--date is the day the call was made.
--dur is the duration of the call in seconds.
--system is the system number.
--port is the port on the system (not particularly relevant for this question).

INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924228','1923','20090416',105,2,2)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239923455','1910','20090416',884,1,97)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924221','1923','20090416',116,2,15)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924259','1924','20090416',90,1,102)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239923458','1910','20090416',891,2,1)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924255','1924','20090416',99,2,42)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924336','1925','20090416',20,2,58)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924293','1924','20090416',64,2,41)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239923472','1911','20090416',888,2,27)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924347','1925','20090416',25,1,100)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924301','1925','20090416',77,2,55)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924332','1925','20090416',52,2,43)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924240','1924','20090416',151,1,17)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924313','1925','20090416',96,2,62)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924094','1921','20090416',315,2,16)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239923643','1914','20090416',788,2,34)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924447','1927','20090416',6,2,27)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924342','1925','20090416',119,2,15)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924397','1926','20090416',76,2,41)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924457','1927','20090416',23,2,27)

Solution

I think MarkusQ has the answer, but let me develop an alternative that you may find easier to use. I'll use my customary method of developing this as a series of simple transformations in views, an analogue of functional decomposition in a procedural language.

First, let's put everything in common units. Recall that record's column s is seconds since the epoch, midnight 1 January 1970. We can find the number of seconds since midnight on the day the call occurred by taking s modulo the number of seconds in a day: s % (60 * 60 * 24).

select *, 
s % (60 * 60 * 24) as start_secs_from_midnight,
s % (60 * 60 * 24) + dur - 1 as end_secs_from_midnight
from record
;

We subtract one from s + dur because a one-second call that starts at 12:00:00 also ends at 12:00:00.

We can find minutes since midnight by dividing those results by 60, or just by floor( s / 60 ) % (60 * 24) :

create view record_mins_from_midnight as
select *, 
floor( s / 60 ) % (60 * 24) as start_mins_fm,
floor( ( s + dur - 1) / 60 ) % (60 * 24) as end_mins_fm 
from record
;
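As a quick sanity check of this arithmetic, here is a minimal Python sketch. It assumes s has already been converted to an integer (the question stores it as char(10)), and the helper name minute_span is made up for illustration:

```python
MINUTES_PER_DAY = 60 * 24

def minute_span(s, dur):
    """Return (start_mins_fm, end_mins_fm) for a call, mirroring the view above."""
    start_mins_fm = (s // 60) % MINUTES_PER_DAY
    # dur - 1 because a one-second call starts and ends in the same second.
    end_mins_fm = ((s + dur - 1) // 60) % MINUTES_PER_DAY
    return start_mins_fm, end_mins_fm

# First sample row from the question: seconds = 1239924228, dur = 105.
first_row = minute_span(1239924228, 105)
print(first_row)
```

For that row the call starts in minute 1403 and ends in minute 1405 of its day, so it would be joined to three minute rows.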

Now we create a table of minutes. We need 1440 of them, numbered from 0 to 1439. In databases that don't support arbitrary sequences, I create an artificial range or sequence like this:

  create table artificial_range ( 
   id int not null primary key auto_increment, idz int) ;
  insert into artificial_range(idz) values (0);
  -- repeat next line to double rows (11 repetitions give 2048 rows, enough for 1440)
  insert into artificial_range(idz) select idz from artificial_range;

So, to create the minutes view:

  create view minutes as 
   select id - 1 as active_minute 
   from artificial_range 
   where id <= 1440
   ;
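The doubling trick can be tried out with a small sketch like the one below. It uses Python's sqlite3 module purely as a stand-in engine, which is an assumption on my part: SQLite spells the auto-numbered key as integer primary key autoincrement, whereas the statements above use MySQL-style auto_increment and SQL Server would use an identity column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite spelling of an auto-numbered key; other engines differ.
conn.execute(
    "create table artificial_range "
    "(id integer primary key autoincrement, idz int)"
)
conn.execute("insert into artificial_range (idz) values (0)")

# Double the row count until there are at least 1440 rows.
doublings = 0
while conn.execute("select count(*) from artificial_range").fetchone()[0] < 1440:
    conn.execute(
        "insert into artificial_range (idz) select idz from artificial_range"
    )
    doublings += 1

# Same projection as the minutes view: ids 1..1440 become minutes 0..1439.
minutes = [
    row[0]
    for row in conn.execute(
        "select id - 1 from artificial_range where id <= 1440 order by id"
    )
]
print(doublings, len(minutes), minutes[0], minutes[-1])
```

Starting from one row, eleven doublings produce 2048 rows, and the where clause trims that down to the 1440 minutes of a day.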

Now we just join minutes to our record view:

create view record_active_minutes as
select * from minutes a 
join record_mins_from_midnight b
on (a.active_minute >= b.start_mins_fm 
and a.active_minute <= b.end_mins_fm)
 ;

This effectively multiplies record rows: we get one record row for each minute during which the call was active, even if only for part of that minute.

Note that I'm doing this by defining active as "(part of) the call occurred during a minute". That is, a two-second call that starts at 12:00:59 and ends at 12:01:00 by this definition occurs during two different minutes, but a two-second call that starts at 12:00:58 and ends at 12:00:59 occurs during one minute.

I did that because you specified "So, I need a way to check for a count of active calls for 7:00-7:01, 7:01-7:02". If you prefer to consider only calls lasting more than sixty seconds to occur in more than one minute, you'll need to adjust the join.
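To make that definition concrete, here is a small Python sketch that lists the minutes a call touches under the same start and end rule as the join. The function name active_minutes is hypothetical, and times are given as seconds since midnight:

```python
def active_minutes(start_secs_from_midnight, dur):
    """List every minute-of-day in which at least one second of the call falls."""
    end_secs = start_secs_from_midnight + dur - 1
    return list(range(start_secs_from_midnight // 60, end_secs // 60 + 1))

# A two-second call starting at 12:00:59 touches two minutes ...
straddling = active_minutes(12 * 3600 + 59, 2)
# ... while a two-second call starting at 12:00:58 stays inside one.
contained = active_minutes(12 * 3600 + 58, 2)
print(straddling, contained)
```

Minute 720 is 12:00 and minute 721 is 12:01, so the first call counts toward both minutes and the second only toward 12:00.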

Now if we want to find the number of active records for any granularity equal to or larger than minute granularity, we just group on that last view. To find the average number of concurrent calls per minute within each hour, we group the minutes into hours and divide the count by 60:

 select floor( active_minute / 60 ) as hour, 
 -- 60.0 avoids integer division on engines where int / int truncates
 count(*) / 60.0 as avg_concurrent_calls_per_minute_for_hour
 from record_active_minutes
 group by floor( active_minute / 60 ) ;

Note that this is the average for each hour of the day over all calls, on all days; to limit it to a particular day or range of days, add a where clause.
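The question also asks for the figures per system, which the query above does not separate. A rough Python sketch of the same grouping with the system number added to the key, run over the sample rows from the question, might look like this. The names are illustrative, seconds is assumed to be converted to an integer, and calls that cross midnight are ignored, just as in the join:

```python
from collections import defaultdict

# (s, dur, system) taken from the sample INSERT statements in the question.
SAMPLE_CALLS = [
    (1239924228, 105, 2), (1239923455, 884, 1), (1239924221, 116, 2),
    (1239924259, 90, 1), (1239923458, 891, 2), (1239924255, 99, 2),
    (1239924336, 20, 2), (1239924293, 64, 2), (1239923472, 888, 2),
    (1239924347, 25, 1), (1239924301, 77, 2), (1239924332, 52, 2),
    (1239924240, 151, 1), (1239924313, 96, 2), (1239924094, 315, 2),
    (1239923643, 788, 2), (1239924447, 6, 2), (1239924342, 119, 2),
    (1239924397, 76, 2), (1239924457, 23, 2),
]

def call_minutes_by_system_and_hour(calls):
    """Count (call, minute) pairs per (system, hour), like the grouped join."""
    totals = defaultdict(int)
    for s, dur, system in calls:
        start_min = (s // 60) % 1440
        end_min = ((s + dur - 1) // 60) % 1440
        # An empty range (end < start) means the call crossed midnight; skipped.
        for minute in range(start_min, end_min + 1):
            totals[(system, minute // 60)] += 1
    return dict(totals)

call_minutes = call_minutes_by_system_and_hour(SAMPLE_CALLS)
averages = {key: total / 60 for key, total in call_minutes.items()}
for (system, hour), avg in sorted(averages.items()):
    print(f"system {system}, hour {hour}: {avg:.4f}")
```

On the sample data every call lands in hour 23 of the UTC day, with 23 call-minutes for system 1 and 75 for system 2. In SQL, the equivalent change would be to add system to both the select list and the group by.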

But wait, there's more!

If we create a version of record_active_minutes that does a left outer join, we can get a report that shows the average over all hours in the day:

 create view record_active_minutes_all as
 select * 
 from 
 minutes a 
 left outer join record_mins_from_midnight b
   on (a.active_minute >= b.start_mins_fm 
       and a.active_minute <= b.end_mins_fm) 
 ;

Then we again do our select, but against the new view:

 select floor( active_minute / 60 ) as hour, 
 -- count(s) skips the null-padded rows of minutes with no calls; count(*) would count them
 count(s) / 60.0 as avg_concurrent_calls_per_min
 from record_active_minutes_all
 group by floor( active_minute / 60 ) ;


+------+------------------------------+
| hour | avg_concurrent_calls_per_min |
+------+------------------------------+
|    0 |                       0.0000 |
|    1 |                       0.0000 |
|    2 |                       0.0000 |
|    3 |                       0.0000 |
   etc....
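One detail worth checking with the left outer join is which count is used. A minute with no calls still yields one row, padded with nulls, so count(*) sees 1 there, while a count over a column from the record side sees 0. The sketch below shows the difference using Python's sqlite3 module and two tiny stand-in tables; the table calls and its contents are made up for the demonstration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table minutes (active_minute int);
    create table calls (s int, start_mins_fm int, end_mins_fm int);
    insert into minutes values (0), (1);
    -- one call, active only during minute 0
    insert into calls values (30, 0, 0);
    """
)

rows = conn.execute(
    """
    select a.active_minute, count(*), count(b.s)
    from minutes a
    left outer join calls b
      on a.active_minute >= b.start_mins_fm
     and a.active_minute <= b.end_mins_fm
    group by a.active_minute
    order by a.active_minute
    """
).fetchall()
print(rows)
```

Minute 1 has no calls, yet count(*) reports 1 for it, whereas count(b.s) reports 0, which is the figure an "all hours" report needs.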

We can also filter this with a where clause. Unfortunately, the join means we'll have null values for the underlying record table where no calls exist for a particular hour, e.g.,

 select floor( active_minute / 60 ) as hour, 
 count(*) / 60.0 as avg_concurrent_calls_per_min
 from record_active_minutes_all
 where month(date) = 1 and year(date) = 2008 
 group by floor( active_minute / 60 ) ;

will bring back no rows for hours in which no calls occurred. If we still want our "report-like" view that shows all hours, we make sure we also include those hours with no records:

 select floor( active_minute / 60 ) as hour, 
 -- count(s) so that the null-padded rows contribute 0 rather than 1
 count(s) / 60.0 as avg_concurrent_calls_per_minute_for_hour
 from record_active_minutes_all
 where (month(date) = 1 and year(date) = 2008) 
 or date is null 
 group by floor( active_minute / 60 ) ;

Note that in the last two examples, I'm using a SQL date (to which the functions month and year can be applied), not the char(8) date in your record table.

Which brings up another point: both the date and time in your record table are superfluous and denormalized, as each can be derived from your column s. Leaving them in the table allows the possibility of inconsistent rows, in which date(s) <> date or time(s) <> time. I'd prefer to do it like this:

   create table record ( id int not null primary key, s int not null, duration int not null) ; 

   create view record_date as 
   select *, dateadd( ss, s, '1970-01-01') as call_date
   from record
  ;

In the dateadd function, ss is the datepart abbreviation that tells the function to add seconds; s is the column in record.
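The same derivation can be checked outside the database. This Python sketch mirrors the dateadd call by adding s seconds to 1 January 1970; the function name call_date is borrowed from the view's column alias and is otherwise arbitrary:

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # midnight UTC, the same origin as in the view

def call_date(s):
    """Mirror dateadd(ss, s, '1970-01-01'); accepts the char(10) value as text."""
    return EPOCH + timedelta(seconds=int(s))

# First sample row from the question.
first_call = call_date("1239924228")
print(first_call.isoformat())
```

For the first sample row this gives 2009-04-16 23:23:48, which agrees with the stored date of 20090416. The stored time for that row is 1923, though, so the time column appears to hold local time rather than UTC; that is worth keeping in mind when comparing hours derived from s with the time column.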