Which is more efficient: strpos or preg_match?

Following up on this question: "Pattern for check single occurrency into preg_match_all"

I understand that in my case I must check only one word per loop iteration, because in that question I have to find both "microsoft" and "microsoft exchange" in the text, and I cannot modify my regexp because those two possibilities are supplied dynamically from a database!

So my question is: which is the better way to check which words a text contains, more than 200 preg_match calls or the same number of strpos calls?

I will try to write the possible code for both solutions:

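$array = array(/* 200+ values */);
foreach ($array as $word)
{
    // preg_quote() keeps characters such as "+" or "." in $word from being read as regex syntax
    $pattern = '<\b(?:' . preg_quote($word) . ')\b>i';
    // only record a word that actually occurs; otherwise $matches[0][0] is undefined
    if (preg_match_all($pattern, $text, $matches))
    {
        $fields['skill'][] = $matches[0][0];
    }
}

Alternatively:

$array = array(/* 200+ values */);
foreach ($array as $word)
{
    // strpos() takes ($haystack, $needle) and returns false when the needle is absent
    if (strpos($text, $word) !== false)
    {
        $fields['skill'][] = $word;
    }
}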
Regex-based functions are slower than most other string functions. By the way, your test can also be done with a single regex if you build it like $pattern='<\b(?:'.$word1.'|'.$word2.'|'.$word3.'|'.$word4.')\b>i'; how many words you can use at once depends on how long the regex can be. I created a test regex that was 12004 chars long, and that does not seem to be the max.
added by JustOnUnderMillions
strpos() is usually 3-20x faster than preg_match, because preg_match is mainly meant for validating the format of a string and for extracting parts of it based on regular expressions.
added by Ross Keddy

Answers

strpos is much faster than preg_match; here is a benchmark:

$array = array();
for ($i = 0; $i < 1000; $i++) $array[] = $i;
$nbloop = 10000;
// sample text elided; substitute any test string
$text = <<<EOD
...
EOD;

$start = microtime(true);
for ($i = 0; $i < $nbloop; $i++) {
    foreach ($array as $word) {
        $pattern = '<\b(?:' . $word . ')\b>i';
        if (preg_match_all($pattern, $text, $matches)) {
            $fields['skill'][] = $matches[0][0];
        }
    }
}
echo "Elapse regex: ", microtime(true) - $start, "\n";


$start = microtime(true);
for ($i = 0; $i < $nbloop; $i++) {
    foreach ($array as $word) {
        // strpos() takes ($haystack, $needle) and returns false when nothing is found
        if (strpos($text, (string) $word) !== false) {
            $fields['skill'][] = $word;
        }
    }
}
echo "Elapse strpos: ", microtime(true) - $start, "\n";

Output reported by the author:

Elapse regex: 7.9924139976501
Elapse strpos: 0.62015008926392

That is about 13 times faster.
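
The regex in the benchmark uses the i modifier, whereas strpos is case-sensitive, so the two loops do not test exactly the same thing. A minimal sketch of a closer equivalent, assuming stripos is acceptable; the helper name findSkills and the sample text are hypothetical and not part of the original answer:

```php
<?php
// Hypothetical helper (not from the original answer): returns every word
// from $words that occurs anywhere in $text, ignoring case.
function findSkills(string $text, array $words): array
{
    $found = [];
    foreach ($words as $word) {
        // stripos() is the case-insensitive variant of strpos();
        // like strpos() it returns false when the needle is absent
        if (stripos($text, (string) $word) !== false) {
            $found[] = $word;
        }
    }
    return $found;
}

$text = 'Administered Microsoft Exchange and Linux servers.';
print_r(findSkills($text, ['microsoft', 'microsoft exchange', 'oracle']));
```

Because every word is checked on its own, both "microsoft" and "microsoft exchange" are reported here. Unlike the \b-anchored regex, this is a plain substring test, so it would also report "micro" for the same text.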

Thank you for the very good answer!
added by Filippo1980

Regex-based functions are slower than most other string functions.

By the way, your test can also be done with a single regex if you build it like $pattern='<\b(?:'.$word1.'|'.$word2.'|'.$word3.'|'.$word4.')\b>i'; how many words you can use at once depends on how long the regex can be. I created a test regex that was 12004 chars long, and that does not seem to be the max.

Regex version (a single call):

$array = array(/* 200+ values */);

// preg_quote() escapes regex metacharacters in each word before they are joined
$pattern = '<\b(?:' . implode('|', array_map('preg_quote', $array)) . ')\b>i';
preg_match_all($pattern, $text, $matches);
// $matches[0] now holds every match that was found
//$fields['skill'][] = $matches[0][0];
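
With a single alternation, the order of the alternatives decides which word wins when one is a prefix of another. A minimal sketch, assuming the word list may contain such pairs; the helper name buildPattern and the sample text are hypothetical:

```php
<?php
// Hypothetical helper: builds one case-insensitive alternation from a word list.
function buildPattern(array $words): string
{
    // Longest first, so "microsoft exchange" is tried before "microsoft"
    usort($words, function ($a, $b) {
        return strlen($b) - strlen($a);
    });
    // preg_quote() escapes regex metacharacters, including the '<' and '>' delimiters
    $quoted = array_map('preg_quote', $words);
    return '<\b(?:' . implode('|', $quoted) . ')\b>i';
}

$pattern = buildPattern(['microsoft', 'microsoft exchange', 'linux']);
preg_match_all($pattern, 'Microsoft Exchange on Linux, plus Microsoft Office', $matches);
print_r($matches[0]);
```

This still yields only one match per position in the text: the "Microsoft" inside "Microsoft Exchange" is not reported separately, so a per-word loop remains necessary when both entries must be found.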

strpos version (many calls):

$array = array(/* 200+ values */);
foreach ($array as $word) {
    // compare with !== false; ">-1" will not work
    if (strpos($text, $word) !== false) {
        $fields['skill'][] = $word;
    }
}

When you search for a single word, strpos will also match Hello inside HelloWorld, so if you only want whole words you can do this:

$arrayOfWords = explode(' ', $string);
// now you can check array against array
$array = array(/* 200+ values */);
foreach ($array as $word) {
    if (in_array($word, $arrayOfWords)) {
        $fields['skill'][] = $word;
    }
}
// you can make this faster still if you array_flip() $arrayOfWords
// and then check with isset() (faster than in_array())

Multi-word phrases ("microsoft exchange") cannot be matched this way, because explode(' ', ...) yields single words and a phrase from the word list will never equal one of them.
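
The array_flip / isset idea mentioned in the code comments can be sketched as follows. The helper name findWholeWords is hypothetical, and lower-casing both sides is an assumption made to mirror the i modifier of the regex version:

```php
<?php
// Hypothetical helper: whole-word lookup through a flipped array.
function findWholeWords(string $text, array $words): array
{
    // Lower-case the text and split it on whitespace, dropping empty pieces
    $tokens = preg_split('/\s+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    // array_flip() turns the tokens into keys, so isset() is a hash lookup
    // instead of the linear scan that in_array() performs
    $lookup = array_flip($tokens);
    $found = [];
    foreach ($words as $word) {
        if (isset($lookup[strtolower($word)])) {
            $found[] = $word;
        }
    }
    return $found;
}

print_r(findWholeWords('HelloWorld uses Microsoft Exchange', ['hello', 'microsoft', 'microsoft exchange']));
```

Only "microsoft" is reported: "hello" is not a whole token and the two-word phrase never equals a single token. Punctuation attached to a word (for example "Exchange,") would also prevent a match unless it is stripped first.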

* added comments

Thanks for your answer, but there is a problem with your regexp... as I said, if I search for "microsoft" and "microsoft exchange" in the same expression, your solution finds only one result!
added by Filippo1980
@Filippo1980 OK, but the same holds for the accepted answer when you search for "microsoft exchange" and not just "microsoft"; that is what I was pointing out. My regexp is the same as yours, it just checks several words at once, and your question was about performance, not about the results you want. -> "So my question is: ..."
added by JustOnUnderMillions
