Perl - Breaking an input variable into two values for a URL

2023-09-10 20:52:13 Author: 成熟性猎杀


I'm loading data from a .txt for the purposes of scraping. However, the URL requires that I break that variable up and do +/- 2 to it. For example, if the value is 2342, I need to create 2340 and 2344 for the purposes of the URL.

I took a guess at how to break it up:

 $args{birth_year} = ($args{birth_year} - 2) . '-' . ($args{birth_year} + 2);
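
As a quick check of what that expression yields, here is a minimal sketch. The helper name `year_range` and its optional margin argument are hypothetical and not part of the original code; the two sample years come from this post.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns "LOW-HIGH",
# where LOW and HIGH are the given year minus and plus a margin (default 2).
sub year_range {
    my ($year, $margin) = @_;
    $margin = 2 unless defined $margin;
    # '-' and '+' treat the value as a number; '.' joins the results as strings.
    return ($year - $margin) . '-' . ($year + $margin);
}

print year_range(1912), "\n";   # prints 1910-1914
print year_range(2342), "\n";   # prints 2340-2344
```

Returning the range from a helper, rather than assigning it back to `$args{birth_year}`, keeps the original year available if it is needed later.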

How do I then put it in the URL?

Here's the relevant part of the code:

  use strict;
  use warnings;
  use WWW::Mechanize::Firefox;
  use Data::Dumper;
  use LWP::UserAgent;
  use JSON;
  use CGI qw/escape/;
  use HTML::DOM;

  open(my $l, '<', 'locations2.txt') or die "Can't open locations: $!";

  while (my $line = <$l>) {
      chomp $line;
      my %args;
      @args{qw/givenname surname birth_place birth_year gender race/} = split /,/, $line;
      $args{birth_year} = ($args{birth_year} - 2) . '-' . ($args{birth_year} + 2);
      my $mech = WWW::Mechanize::Firefox->new(create => 1, activate => 1);
      $mech->get("https://familysearch.org/search/collection/index#count=20&query=%2Bgivenname%3A$args{givenname}20%2Bsurname%3A$args{surname}20%2Bbirth_place%3A$args{birth_place}%20%2Bbirth_year%3A1910-1914~%20%2Bgender%3A$args{gender}20%2Brace%3A$args{race}&collection_id=2000219");
  }

For example, the input is:

Benjamin,Schuvlein,Germany,1912,M,White

Desired URL is:

https://familysearch.org/search/collection/index#count=20&query=%2Bgivenname%3ABenjamin%20%2Bsurname%3ASchuvlein%20%2Bbirth_place%3AGermany%20%2Bbirth_year%3A1910-1914~%20%2Bgender%3AM%20%2Brace%3AWhite&collection_id=2000219

Solution

Why can't you just change this line:

$mech->get("https://familysearch.org/search/collection/index#count=20&query=%2Bgivenname%3A$args{givenname}20%2Bsurname%3A$args{surname}20%2Bbirth_place%3A$args{birth_place}%20%2Bbirth_year%3A1910-1914~%20%2Bgender%3A$args{gender}20%2Brace%3A$args{race}&collection_id=2000219");

to this:

$mech->get("https://familysearch.org/search/collection/index#count=20&query=%2Bgivenname%3A$args{givenname}20%2Bsurname%3A$args{surname}20%2Bbirth_place%3A$args{birth_place}%20%2Bbirth_year%3A$args{birth_year}~%20%2Bgender%3A$args{gender}20%2Brace%3A$args{race}&collection_id=2000219");

NOTE: I changed this bit:

%3A1910-1914~%20

to this:

%3A$args{birth_year}~%20
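
For reference, here is a minimal sketch of building the whole URL from one input line, so that it can be printed and compared before being passed to `$mech->get`. It rests on a few assumptions: every search term is separated by `%20`, as in the desired URL from the question (the `get` line has a bare `20` after some fields); the helper names `escape_value` and `build_search_url` are hypothetical; and the escaping is a simplified, ASCII-only stand-in for what a module such as URI::Escape would normally do.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode every character that is not an
# unreserved URL character. Simplified: intended for ASCII input only.
sub escape_value {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $value;
}

# Hypothetical helper: turn one comma-separated input line into the search URL.
sub build_search_url {
    my ($line) = @_;
    my %args;
    @args{qw/givenname surname birth_place birth_year gender race/} = split /,/, $line;

    # Keep the range in its own variable so $args{birth_year} stays a plain year.
    my $year_range = ($args{birth_year} - 2) . '-' . ($args{birth_year} + 2);

    # One search term per field; scalars interpolate inside double quotes.
    my @terms = (
        "%2Bgivenname%3A"   . escape_value($args{givenname}),
        "%2Bsurname%3A"     . escape_value($args{surname}),
        "%2Bbirth_place%3A" . escape_value($args{birth_place}),
        "%2Bbirth_year%3A$year_range~",
        "%2Bgender%3A"      . escape_value($args{gender}),
        "%2Brace%3A"        . escape_value($args{race}),
    );

    # Assumption: terms are joined with %20, matching the desired URL.
    return 'https://familysearch.org/search/collection/index#count=20&query='
         . join('%20', @terms)
         . '&collection_id=2000219';
}

print build_search_url('Benjamin,Schuvlein,Germany,1912,M,White'), "\n";
```

With the sample input from the question, this prints the desired URL shown there. Escaping each value also keeps fields that contain spaces or other reserved characters, such as a two-word birth place, from breaking the query.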