Regexp for subdomain

Does anyone know how to write a regexp that only allows a-zA-Z0-9.- (letters, digits, dots and dashes) BUT never begins or ends with a dot or a dash?

I tried this one:

/^[^.-][a-zA-Z0-9.-]+[^.-]$/

... but if I write something like "john@", it is accepted, and I don't want that because @ is not allowed.

Which regex flavor? (Perl, egrep, awk, vim, JavaScript ...)
added by Benoit

7 answers

Subdomain

According to the pertinent internet recommendations (RFC3986 section 2.2, which in turn refers to RFC1034 section 3.5 and RFC1123 section 2.1), a subdomain (which is a part of a DNS domain host name) must meet several requirements:

  • Each subdomain part must have a length no greater than 63.
  • Each subdomain part must begin and end with an alphanumeric character (a letter [A-Za-z] or a digit [0-9]).
  • Each subdomain part may contain hyphens (dashes), but may not begin or end with a hyphen.

Here is an expression fragment for a subdomain part which meets these requirements:

[A-Za-z0-9](?:[A-Za-z0-9\-]{0,61}[A-Za-z0-9])?

Note that this expression fragment should not be used alone; it needs boundary conditions supplied by a larger context, as demonstrated in the following expression for a DNS host name ...
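As a minimal sketch of supplying those boundary conditions, the fragment can be wrapped in ^ and $ to check a single label in JavaScript. The name isValidLabel is hypothetical and not part of the answer.

```javascript
// Sketch: anchor the label fragment so it must cover the whole input.
// isValidLabel is an illustrative name, not part of the original answer.
const LABEL = /^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$/;

function isValidLabel(label) {
  return LABEL.test(label);
}

console.log(isValidLabel("my-host"));      // true
console.log(isValidLabel("-host"));        // false: leading hyphen
console.log(isValidLabel("a".repeat(64))); // false: longer than 63
```

Without the anchors, the same fragment would happily match a valid-looking substring inside an otherwise invalid string.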

DNS host name

A named host (not an IP address) must meet additional requirements:

  • The host name may consist of multiple Subdomain parts, each separated by a single dot.
  • The length of the overall host name should not exceed 255 characters.
  • The top level domain (the rightmost part of the DNS host name) must be one of the internationally recognized values. The list of valid top level domains is maintained by IANA.ORG. (See the bare-bones current list here: http://data.iana.org/TLD/tlds-alpha-by-domain.txt).

With this in mind, here is a commented regex (in PHP syntax) which will pseudo-validate a DNS host name. (Note that this incorporates a modified version of the above expression for a subdomain and adds comments to it as well.)

Update 2016-08-20: Since this answer was originally posted back in 2011, the number of top-level domains has exploded. As of August 2016 there are more than 1400. The original regex in this answer incorporated all of them, but this is no longer practical. The new regex below incorporates a different expression for the top-level domain. The algorithm comes from: Top Level Domain Name Specification draft-liman-tld-names-06.

$DNS_named_host = '%(?#!php/i DNS_named_host Rev:20160820_0800)
    # Match DNS named host domain having one or more Subdomains.
    # See: http://stackoverflow.com/a/7933253/433790
    ^                     # Anchor to start of string.
    (?!.{256})            # Whole domain must be 255 or less.
    (?:                   # One or more sub-domains.
      [a-z0-9]            # Subdomain begins with alpha-num.
      (?:                 # Optionally more than one char.
        [a-z0-9-]{0,61}   # Middle part may have dashes.
        [a-z0-9]          # Starts and ends with alpha-num.
      )?                  # Subdomain length from 1 to 63.
      \.                  # Required dot separates Subdomains.
    )+                    # End one or more sub-domains.
    (?:                   # Top level domain (length from 1 to 63).
      [a-z]{1,63}         # Either traditional-tld-label = 1*63(ALPHA).
    | xn--[a-z0-9]{1,59}  # Or an idn-label = Restricted-A-Label.
    )                     # End top level domain.
    $                     # Anchor to end of string.
    %xi'; //End $DNS_named_host.

Note that this expression is not perfect. It requires one or more subdomains, but technically a host can consist of a TLD with no subdomain (though this is rare).
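For readers working in JavaScript, here is a hedged port of the PHP expression above. JavaScript has no free-spacing (x) flag, so the sketch assembles the pattern from strings; the names label, tld, DNS_NAMED_HOST and isDnsNamedHost are illustrative assumptions, not part of the answer.

```javascript
// Sketch: a JavaScript rebuild of the commented PHP pattern.
// All identifiers here are illustrative, not from the original answer.
const label = "[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?"; // 1 to 63 chars
const tld = "(?:[a-z]{1,63}|xn--[a-z0-9]{1,59})";     // alpha or IDN label
const DNS_NAMED_HOST = new RegExp(
  "^(?!.{256})" +           // whole name must be 255 chars or fewer
  "(?:" + label + "\\.)+" + // one or more dot-terminated subdomains
  tld + "$",                // top level domain, then end of string
  "i"
);

function isDnsNamedHost(host) {
  return DNS_NAMED_HOST.test(host);
}

console.log(isDnsNamedHost("www.example.com"));  // true
console.log(isDnsNamedHost("sub-.example.com")); // false: label ends with hyphen
console.log(isDnsNamedHost("localhost"));        // false: needs at least one dot
```

The last call illustrates the limitation noted above: a bare TLD-style name with no subdomain is rejected.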

Update 2014-08-12: Added simplified expression for Subdomain which does not require alternation.

Update 2016-08-20: Modified DNS host name regex to (more generally) match the new vast number of valid top level domains. Also, trimmed out unnecessary material from answer.

@algorhythm - My interpretation of the RFCs is that double hyphens are perfectly fine, but each subdomain part may not begin or end with a hyphen.
added by ridgerunner
@Qqwy - Yes, you are absolutely correct. When I get some time, I'll update the answer. Thanks for the comment!
added by ridgerunner
Finally found some time to improve this a bit.
added by ridgerunner
This should be the accepted answer. Is there something I'm not seeing?
added by Yusuf Uzun
This is good as a rough validation, but first, underscores are perfectly legal, so ^\w(?:[\w-]{0,61}\w)$ for subdomain parts would work just fine; in fact, SRV records require them to avoid collisions with ordinary subdomains. Also, FYI, double hyphens are used by punycode. You could, of course, restrict such checks to particular record types, but then you would have to write a small parser, or something that lets you check against the current list :)
added by sg3s
Hmm, I think a double "-" is not valid either, but it is possible with this regex, right?
added by algorhythm
Note that, as of 2016, there are far more allowed TLDs than the supplied DNS hostname regex permits.
added by Qqwy
Thanks for this answer
added by swietyy
Thanks, great answer!
added by Pedro Emilio Borrego Rached

You want the first and last characters restricted to alphanumerics. What you have allows the first and last characters to be anything other than a dot or a dash. This fits the description:

/^[a-zA-Z0-9][a-zA-Z0-9.-]+[a-zA-Z0-9]$/

Fails on test.subdomain..com
added by Dinesh Patra
Probably the underscore (_) should be allowed as well. And a small note: this regexp can be simplified to /^\w[\w.-]+\w$/i.
added by RReverser
For PHP. Thank you for your help, it works great: [a-zA-Z0-9][a-zA-Z0-9.-]+[a-zA-Z0-9]
added by user1018527
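
To make the comments on this answer concrete, here is a small JavaScript sketch of how the pattern behaves. It assumes the first comment means that consecutive dots are accepted; the tightened variant with a lookahead is only one possible adjustment and is not part of the answer.

```javascript
// The pattern from the answer above.
const pattern = /^[a-zA-Z0-9][a-zA-Z0-9.-]+[a-zA-Z0-9]$/;

console.log(pattern.test("john@"));               // false: @ is rejected
console.log(pattern.test("test.subdomain..com")); // true: ".." slips through
console.log(pattern.test("ab"));                  // false: "+" forces 3+ chars

// Hypothetical tightening (an assumption, not the author's regex):
// a negative lookahead that rejects two dots in a row.
const noDoubleDot = /^(?!.*\.\.)[a-zA-Z0-9][a-zA-Z0-9.-]+[a-zA-Z0-9]$/;

console.log(noDoubleDot.test("test.subdomain..com")); // false
console.log(noDoubleDot.test("test.subdomain.com"));  // true
```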

In our project we match subdomains like this:

Client JS

^([A-Za-z0-9](?:(?:[-A-Za-z0-9]){0,61}[A-Za-z0-9])?(?:\.[A-Za-z0-9](?:(?:[-A-Za-z0-9]){0,61}[A-Za-z0-9])?){2,})$

Server Ruby

\A([A-Za-z0-9](?:(?:[-A-Za-z0-9]){0,61}[A-Za-z0-9])?(?:\.[A-Za-z0-9](?:(?:[-A-Za-z0-9]){0,61}[A-Za-z0-9])?){2,})\z
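
A short JavaScript sketch of what the client-side pattern accepts. Because of the {2,} quantifier on the dot-prefixed group, it needs at least three labels; the sample host names and the constant name HOST are made up for illustration.

```javascript
// The client-side pattern from above, unchanged.
const HOST = /^([A-Za-z0-9](?:(?:[-A-Za-z0-9]){0,61}[A-Za-z0-9])?(?:\.[A-Za-z0-9](?:(?:[-A-Za-z0-9]){0,61}[A-Za-z0-9])?){2,})$/;

console.log(HOST.test("shop.example.com"));  // true: three labels
console.log(HOST.test("example.com"));       // false: only two labels
console.log(HOST.test("-shop.example.com")); // false: label starts with hyphen
```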

Here is a DOMAIN + SUBDOMAIN solution that may help others:

   /^([a-zA-Z0-9]([-a-zA-Z0-9]{0,61}[a-zA-Z0-9])?\.)?([a-zA-Z0-9]{1,2}([-a-zA-Z0-9]{0,252}[a-zA-Z0-9])?)\.([a-zA-Z]{2,63})$/

It passes the following Chai tests:

const expect = require('chai').expect;

function testDomainValidNamesRegExp(val) {
    let names = /^([a-zA-Z0-9]([-a-zA-Z0-9]{0,61}[a-zA-Z0-9])?\.)?([a-zA-Z0-9]([-a-zA-Z0-9]{0,252}[a-zA-Z0-9])?)\.([a-zA-Z]{2,63})$/;
    return names.test(val);
} 

let validDomainNames = [
    "example.com",
    "try.direct",
    "my-example.com",
    "subdomain.example.com",
    "example.com",
    "example23.com",
    "regexp-1222.org",
    "read-book.net",
    "org.host.org",
    "org.host.org",
    "velmart.shop-products.md",
    "ip2email.terronosp-222.lb",
    "stack.com",
    "sta-ck.com",
    "sta---ck.com",
    "9sta--ck.com",
    "sta--ck9.com",
    "stack99.com",
    "99stack.com",
    "sta99ck.com",
    "sub.do.com",
    "ss.sss-ss.ss",
    "s.sss-ss.ss",
    "s.s-s.ss",
    "test.t.te"
    ];

let invalidDomainNames = [
     "example2.com222",
     "@example.ru:?",
     "example22:89",
     "@[email protected]@22-",
     "example.net?1222",
     "example.com:8080:",
     ".example.com:8080:",
     "---test.com",
     "$dollars$.gb",
     "sell-.me",
     "[email protected]",
     "mem-.wer().or%:222",
     "pop().addjocker.lon",
     "regular-l=.heroes?",
     " ecmas cript-8.org ",
     "example.com::%",
     "example:8080",
     "example",
     "examaple.com:*",
    "-test.test.com",
    "-test.com",
    "dd-.test.com",
    "dfgdfg.dfgdf33.e",
    "dddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd-.test.com",
    "dddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd.testttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttt.com",
    "d-.test.com"
];

describe("Test Domain Valid Names RegExp",() => {
    validDomainNames.forEach((val) => {
        it(`Text: ${val}`,() => {
            expect(testDomainValidNamesRegExp(val)).to.be.true;
        });
    });
});

describe("Test Domain Invalid Names RegExp",() => {
    invalidDomainNames.forEach((val) => {
        it(`Text: ${val}`,() => {
            expect(testDomainValidNamesRegExp(val)).to.be.false;
        });
    });
});

More tests are very welcome!
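
In that spirit, here are a few extra cases as a plain JavaScript sketch (no Chai needed). The inputs are hypothetical and are not taken from the lists above; they probe behaviours that those lists do not cover.

```javascript
// Same expression as in testDomainValidNamesRegExp above.
const names = /^([a-zA-Z0-9]([-a-zA-Z0-9]{0,61}[a-zA-Z0-9])?\.)?([a-zA-Z0-9]([-a-zA-Z0-9]{0,252}[a-zA-Z0-9])?)\.([a-zA-Z]{2,63})$/;

// Only one optional subdomain level is supported.
console.log(names.test("a.b.example.com"));         // false

// The TLD part allows letters only, so punycode TLDs are rejected.
console.log(names.test("example.xn--p1ai"));        // false

// The domain label may exceed the 63-character DNS label limit.
console.log(names.test("a".repeat(100) + ".com"));  // true
```

Whether these results are acceptable depends on the project; they are shown only to document what the expression does.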

Updated with a small fix and one more test.
added by Vasili Pascal

Check this out:

/^[a-zA-Z0-9][a-zA-Z0-9.-]*[a-zA-Z0-9]$/

This requires the string to have at least 2 characters to match: one from a-zA-Z0-9 and another from a-zA-Z0-9. To avoid that, you can use this regex:

/^[a-zA-Z0-9][a-zA-Z0-9.-]*$/

But then an additional check is needed to make sure the string does not end with a dot or a dash.
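
One way that additional check might look in JavaScript, as a sketch; the function name isValidName is hypothetical.

```javascript
// Sketch: combine the shorter regex with an explicit end-of-string check.
// isValidName is an illustrative name, not part of the original answer.
function isValidName(s) {
  const charsOk = /^[a-zA-Z0-9][a-zA-Z0-9.-]*$/.test(s);
  const endOk = !s.endsWith(".") && !s.endsWith("-");
  return charsOk && endOk;
}

console.log(isValidName("a"));               // true: one character is enough
console.log(isValidName("my-host.example")); // true
console.log(isValidName("john."));           // false: ends with a dot
console.log(isValidName("john@"));           // false: @ is not allowed
```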


Try this regex:

^(?![-.])[a-zA-Z0-9.-]+(?

Try this regexp: /^[a-zA-Z0-9][a-zA-Z0-9.-]*[a-zA-Z0-9]$/ The problem in your code was that [^.-] at the beginning and at the end matches any character other than '.' or '-'. You don't want to match every character there, only [a-zA-Z0-9].
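
A short JavaScript sketch of the difference, using the regex from the question and the one suggested here; the variable names are illustrative.

```javascript
// [^.-] is a negated class: it accepts ANY character except "." and "-".
const original = /^[^.-][a-zA-Z0-9.-]+[^.-]$/;
const fixed = /^[a-zA-Z0-9][a-zA-Z0-9.-]*[a-zA-Z0-9]$/;

console.log(original.test("john@")); // true: "@" satisfies [^.-]
console.log(original.test("@john")); // true: same problem at the start
console.log(fixed.test("john@"));    // false
console.log(fixed.test("@john"));    // false
console.log(fixed.test("john"));     // true
```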
