Pandas: merge if left column matches any of right columns

Is there a way to merge two dataframes when a column of the left dataframe matches any one of the columns of the right dataframe? In SQL this would be:

SELECT
  t1.*, t2.*
FROM
  t1
JOIN
  t2 ON t1.c1 = t2.c1 OR 
        t1.c1 = t2.c2 OR 
        t1.c1 = t2.c3 OR 
        t1.c1 = t2.c4

Python (something like):

import pandas as pd

dataA = [(1), (2)]

pdA = pd.DataFrame(dataA)
pdA.columns = ['col']

dataB = [(1, None), (None, 2), (1, 2)]

pdB = pd.DataFrame(dataB)
pdB.columns = ['col1', 'col2']

# DataFrame.append was removed in pandas 2.0; pd.concat stacks the two merges instead
pd.concat([
    pdA.merge(pdB, left_on='col', right_on='col1'),
    pdA.merge(pdB, left_on='col', right_on='col2'),
])
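
The two-merge approach above can be generalised to any number of right columns. The following is only a minimal sketch: `merge_any` is a hypothetical helper name, not a pandas function, and it assumes that dropping fully identical rows is acceptable (the same pair of rows is produced once for every right column it matches on).

```python
import pandas as pd


def merge_any(left, right, left_on, right_cols):
    """Merge `left` with `right` once per right column, then stack and de-duplicate."""
    # one ordinary equality merge per candidate right column
    pieces = [left.merge(right, left_on=left_on, right_on=col) for col in right_cols]
    # a pair of rows that matches on several columns appears several times, so drop repeats
    stacked = pd.concat(pieces, ignore_index=True)
    return stacked.drop_duplicates().reset_index(drop=True)


pdA = pd.DataFrame({'col': [1, 2]})
pdB = pd.DataFrame({'col1': [1, None, 1], 'col2': [None, 2, 2]})

result = merge_any(pdA, pdB, 'col', ['col1', 'col2'])
print(result)
```

Note that `drop_duplicates()` also removes rows that were already identical in the inputs, which an SQL join with OR conditions would keep.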


@PaulH Actually, ignore_index=True when appending and a .drop_duplicates() call are also needed, to get rid of the duplicate rows produced when the left column value matches both right column values.
added by Denis Kulagin
@PaulH I always wonder whether I'm doing things the Pandas-fu way or reinventing the wheel.
added by Denis Kulagin
I don't think the third dataframe is what you want. Can you show the data you want?
added by Paul H
So, what's the question? It looks like you already have your answer.
added by Paul H
Looks good to me.
added by Paul H

Answers

It looks like we're doing a row-wise isin check. I like to use set logic, with numpy broadcasting to help out.

import numpy as np

# turn every row into the set of its non-null values
f = lambda x: set(x.dropna())
npB = pdB.apply(f, axis=1).values
npA = pdA.apply(f, axis=1).values

# a[i, j] is True when row j of pdA is a subset of row i of pdB
a = npA <= npB[:, None]
m, n = a.shape

# row positions in pdA and pdB for every cell of a, in ravel order
rA = np.tile(np.arange(n), m)
rB = np.repeat(np.arange(m), n)

a_ = a.ravel()

# keep only the matching pairs and place them side by side
pd.DataFrame(
    np.hstack([pdA.values[rA[a_]], pdB.values[rB[a_]]]),
    columns=pdA.columns.tolist() + pdB.columns.tolist()
)

   col  col1  col2
0  1.0   1.0   NaN
1  2.0   NaN   2.0
2  1.0   1.0   2.0
3  2.0   1.0   2.0
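
To see what the comparison `npA <= npB[:, None]` is doing, here is a small stand-alone sketch. The two object arrays are written out by hand (an assumption made for illustration; in the answer they come from the apply step), and `pairs` is a name introduced here only.

```python
import numpy as np

# one set of non-null values per row, written out by hand for illustration
npA = np.array([{1}, {2}], dtype=object)
npB = np.array([{1}, {2}, {1, 2}], dtype=object)

# on sets, <= means "is a subset of"; broadcasting (2,) against (3, 1)
# compares every row of A with every row of B
a = (npA <= npB[:, None]).astype(bool)
print(a.shape)

# (row of B, row of A) positions of every match
pairs = np.argwhere(a)
print(pairs)
```

Each `True` cell in `a` marks one pair of rows that ends up in the merged result.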

Unfortunately, I don't think there is a built-in method for this. pandas joins are fairly limited, in that you can basically only test a left column for equality with a right column.

It is still doable, though, by forming the cross product and then checking all the relevant conditions. As a result it uses some extra memory, but it shouldn't be too inefficient.

Note: I changed your test cases a bit to make them more general, and renamed the variables to something a bit more intuitive.

import pandas as pd
from functools import reduce

dataA = [1, 2]

dfA = pd.DataFrame(dataA)
dfA.columns = ['col']

dataB = [(1, None, 1), (None, 2, None), (1, 2, None)]

dfB = pd.DataFrame(dataB)
dfB.columns = ['col1', 'col2', 'col3']

print(dfA)
print(dfB)


def cross(left, right):
    """Returns the cross product of the two dataframes, keeping the index of the left"""

    # create dummy columns on the dataframes that will always match in the merge
    left["_"] = 0
    right["_"] = 0

    # merge, keeping the left index, and dropping the dummy column
    result = left.reset_index().merge(right, on="_").set_index("index").drop("_", axis=1)

    # drop the dummy columns from the mutated dataframes
    left.drop("_", axis=1, inplace=True)
    right.drop("_", axis=1, inplace=True)
    return result


def merge_left_in_right(left_df, right_df):
    """Return the join of the two dataframes where the element of the left dataframe's column
    is in one of the right dataframe's columns"""

    left_col, right_cols = left_df.columns[0], right_df.columns

    result = cross(left_df, right_df)    # form the cross product with a view to filtering it

    # a row must satisfy one of the following conditions:
    tests = (result[left_col] == result[right_col] for right_col in right_cols)

    # form the disjunction of the conditions
    left_in_right = reduce(lambda left_bools, right_bools: left_bools | right_bools, tests)

    # return the appropriate rows
    return result[left_in_right]


print(merge_left_in_right(dfA, dfB))
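
On pandas 1.2 or later, `merge` accepts `how="cross"`, so the dummy-column helper can be skipped. A minimal sketch under that assumption follows; `merge_left_in_right_cross` is a hypothetical name, and the sketch assumes the left column name does not also appear in the right dataframe. Unlike the version above, it resets the index instead of keeping the left one.

```python
import pandas as pd


def merge_left_in_right_cross(left_df, right_df):
    """Cross join, then keep rows where the left column equals any right column."""
    left_col = left_df.columns[0]
    right_cols = list(right_df.columns)

    # every left row paired with every right row (needs pandas >= 1.2)
    crossed = left_df.merge(right_df, how="cross")

    # compare the left column with all right columns at once; NaN never compares equal
    matches = crossed[right_cols].eq(crossed[left_col], axis=0).any(axis=1)
    return crossed[matches].reset_index(drop=True)


dfA = pd.DataFrame({'col': [1, 2]})
dfB = pd.DataFrame({
    'col1': [1, None, 1],
    'col2': [None, 2, 2],
    'col3': [1, None, None],
})

out = merge_left_in_right_cross(dfA, dfB)
print(out)
```

The memory cost is the same as before, because the full cross product is still built before filtering.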