The DICES Dataset: Diversity in Conversational AI Evaluation for Safety

Introduces the DICES dataset for evaluating conversational AI safety, with demographically diverse raters, fine-grained rater information, and the statistical power needed to analyze variance and ambiguity.

1. Introduction

The proliferation of conversational AI systems built on Large Language Models (LLMs) has made safety evaluation a critical concern. Traditional approaches often rely on datasets with a clean split between "safe" and "unsafe" labels, which oversimplifies the subjective and culturally situated nature of safety. The DICES (Diversity in Conversational AI Evaluation for Safety) dataset, introduced by researchers from Google Research, City University of London, and the University of Cambridge, addresses this gap by providing a resource that captures the variance, ambiguity, and diversity of human perspectives on AI safety.

DICES is designed around three core principles: 1) including fine-grained demographic information about raters (e.g., ethnicity, age, gender), 2) high replication of ratings per conversation item to ensure statistical power, and 3) encoding rater votes as distributions across demographic groups so that different aggregation strategies can be explored. This design moves beyond a "single ground truth" and instead treats safety as a multifaceted construct that depends on the rater population.

1.1. Contributions

The main contributions of the DICES dataset and its accompanying study are:

  • Rater Diversity as a Core Feature: Shifting the focus from mitigating "bias" to embracing and analyzing the "diversity" of rater perspectives.
  • A Framework for Fine-Grained Analysis: Providing a data structure that makes it possible to examine how perceptions of safety intersect with demographic groups.
  • A Benchmark for Nuanced Evaluation: Establishing DICES as a shared resource for evaluating conversational AI systems in a way that respects differing perspectives, going beyond a single safety score.

2. Core Insight & Logical Flow

Core Insight: The fundamental flaw in mainstream AI safety evaluation is not a lack of data but a lack of representative, disaggregated data. Treating safety as a consensus-driven, binary classification task is a dangerous simplification that erases cultural nuance and can produce systems that are "safe" only for the dominant demographic. DICES correctly recognizes that safety is a social construct and that its evaluation must be probabilistic, not deterministic.

Logical Flow: The paper's argument is sharp: 1) Current LLM safety fine-tuning relies on oversimplified data. 2) That simplification ignores subjective variation, which is especially problematic for safety, a socially situated concept. 3) We therefore need a new kind of dataset that captures this variation explicitly through demographic diversity and high rater replication. 4) DICES provides this, enabling analyses that reveal which groups find which items unsafe, and to what degree. This flow dismantles the myth of a universal safety metric and replaces it with a framework for mapping the safety landscape.

3. Strengths & Flaws

Strengths:

  • Paradigm-Shifting Design: The move from binary labels to demographic distributions is its killer feature. It forces the field to confront the plurality of safety.
  • Statistical Rigor: High replication per item is non-negotiable for sound demographic analysis, and DICES gets this right. It provides the statistical power needed to move beyond anecdotes.
  • Actionable for Model Development: It does not merely diagnose a problem; it provides a structure (distributions) that can directly inform finer-grained fine-tuning and evaluation metrics, much as uncertainty quantification has improved model calibration.

Flaws & Open Questions:

  • The "Demographic Problem": Although it includes key demographics, the choice of categories (ethnicity, age, gender) is only a starting point. It misses intersectionality (e.g., young Black women) and other axes such as socioeconomic status, disability, or cultural context, which are equally important for a complete picture.
  • The Operationalization Challenge: The paper is light on the how. How exactly should a model developer use these distributions? Do you fine-tune to the mean? The mode? Or build a system that adapts its safety filtering to the inferred demographics of the user? The step from rich data to engineering practice is the next mountain to climb.
  • A Static Snapshot: Societal norms around safety evolve. A dataset, however diverse, is a static snapshot. The framework has no explicit mechanism for continuously updating these safety perceptions, a challenge shared by other static ethics datasets.
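
The mean-versus-mode question can be made concrete by comparing aggregation rules on the same votes. The sketch below is illustrative only: the function name `aggregate_targets`, the binary coding (1 = unsafe), the tie rule, and the toy votes are all assumptions and do not come from DICES.

```python
def aggregate_targets(votes_by_group):
    """Compare three ways of collapsing per-group binary votes (1 = unsafe).

    votes_by_group: {group_name: [0 or 1, ...]}  (hypothetical toy input)
    """
    all_votes = [v for votes in votes_by_group.values() for v in votes]
    pooled_mean = sum(all_votes) / len(all_votes)
    # Pooled mode (majority vote); ties count as unsafe, which is an assumption.
    pooled_mode = 1 if pooled_mean >= 0.5 else 0
    # Highest unsafe rate found in any single group.
    group_rates = {g: sum(v) / len(v) for g, v in votes_by_group.items()}
    return {
        "mean": pooled_mean,
        "mode": pooled_mode,
        "max_group_rate": max(group_rates.values()),
    }


# Toy votes: group A is larger and mostly says "safe"; group B mostly says "unsafe".
toy_votes = {"A": [0, 0, 0, 0, 0, 0, 0, 1], "B": [1, 1, 1, 0]}
targets = aggregate_targets(toy_votes)
print(targets)
```

With these toy votes the pooled majority says "safe" even though three quarters of group B disagree, which is the kind of loss a single aggregated label hides.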

4. Actionable Insights

For AI practitioners and product leaders:

  1. Audit Immediately: Use the DICES framework (distributions, not means) to audit your existing safety classifiers. You will likely find that they align with only a narrow slice of the population. This is a reputational and product risk.
  2. Reframe Your Metrics: Stop reporting a single "safety score." Report a safety profile instead: "This model's outputs align with Group A's safety perceptions at X% agreement and diverge from Group B's on topics Y and Z." Transparency builds trust.
  3. Invest in Adaptive Safety: The endgame is not one universally safe model but models that understand context, including the user's context. Research investment should shift from monolithic safety filters toward context-aware safety mechanisms, and possibly user-level personalization, so that model behavior suits its audience. Work on value alignment in AI ethics, as discussed by the Stanford Institute for Human-Centered AI (HAI), stresses that alignment must accommodate a plurality of human values rather than a single set.
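
A per-group "safety profile" of the kind described in item 2 could be computed roughly as follows. This is a minimal sketch under stated assumptions: `safety_profile`, the `(conv_id, group, verdict)` tuple layout, the binary coding (1 = unsafe), and the tie rule are hypothetical choices, not a DICES API.

```python
from collections import defaultdict


def safety_profile(model_labels, ratings):
    """Fraction of items where the model's label matches each group's majority verdict.

    model_labels: {conv_id: 0 or 1}                  (1 = unsafe; assumed coding)
    ratings: iterable of (conv_id, group, verdict)   (hypothetical layout)
    """
    votes = defaultdict(lambda: defaultdict(list))
    for conv_id, group, verdict in ratings:
        votes[group][conv_id].append(verdict)

    profile = {}
    for group, items in votes.items():
        hits = total = 0
        for conv_id, verdicts in items.items():
            if conv_id not in model_labels:
                continue  # skip items the model never labeled
            # Majority verdict within the group; ties count as unsafe (an assumption).
            majority = 1 if 2 * sum(verdicts) >= len(verdicts) else 0
            hits += int(model_labels[conv_id] == majority)
            total += 1
        if total:
            profile[group] = hits / total
    return profile


# Toy data: the model labels c1 "safe" and c2 "unsafe".
toy_labels = {"c1": 0, "c2": 1}
toy_ratings = [
    ("c1", "A", 0), ("c1", "A", 0), ("c1", "A", 1),
    ("c2", "A", 1), ("c2", "A", 1),
    ("c1", "B", 1), ("c1", "B", 1), ("c1", "B", 0),
    ("c2", "B", 1), ("c2", "B", 0),
]
profile = safety_profile(toy_labels, toy_ratings)
print(profile)
```

Reporting the whole dictionary, rather than its average, is what turns a single safety score into a profile.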

5. Technical Framework & Dataset Design

The DICES dataset is built around human-bot conversations rated for safety by a large pool of raters stratified by demographics. The key innovation is the data structure: instead of storing a single label (e.g., "unsafe"), each conversation item is associated with a multi-dimensional array of ratings partitioned into demographic buckets.

For a given conversation $c_i$, the dataset does not provide $\text{label}(c_i) \in \{0, 1\}$. Instead, it provides a set of rater responses $R_i = \{r_{i,1}, r_{i,2}, \ldots, r_{i,N}\}$, where each response $r_{i,j}$ is a tuple $(v_{i,j}, d_{i,j})$. Here, $v_{i,j}$ is the safety verdict (e.g., on a Likert or binary scale), and $d_{i,j}$ is a vector encoding the rater's demographic attributes (e.g., $d_{i,j} = [\text{gender}=G1, \text{age}=A2, \text{ethnicity}=E3]$).
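
The response set $R_i$ maps naturally onto a small record type. The sketch below shows one possible in-memory layout; the class names `Rating` and `Conversation`, the helper `ratings_for`, and the field names are hypothetical and do not reflect the released DICES file schema.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Rating:
    """One rater response r_{i,j} = (v_{i,j}, d_{i,j})."""
    verdict: int                  # v: 1 = unsafe, 0 = safe (binary scale assumed)
    demographics: Dict[str, str]  # d: e.g. {"gender": "G1", "age": "A2", "ethnicity": "E3"}


@dataclass
class Conversation:
    """A conversation item c_i together with its full response set R_i."""
    conv_id: str
    ratings: List[Rating]


def ratings_for(conv: Conversation, **attrs: str) -> List[Rating]:
    """Return the ratings whose demographics match every given attribute."""
    return [
        r for r in conv.ratings
        if all(r.demographics.get(key) == value for key, value in attrs.items())
    ]


# Toy item with three raters; the codes G1, A2, E3 follow the notation in the text.
conv = Conversation("c1", [
    Rating(1, {"gender": "G1", "age": "A2", "ethnicity": "E3"}),
    Rating(0, {"gender": "G2", "age": "A2", "ethnicity": "E1"}),
    Rating(1, {"gender": "G1", "age": "A1", "ethnicity": "E3"}),
])
g1_ratings = ratings_for(conv, gender="G1")
print(len(g1_ratings))
```

Keeping every individual rating, instead of a pre-aggregated label, is what lets later analyses slice by any combination of attributes.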

5.1. Mathematical Representation of Rater Distributions

The core analytical power comes from aggregating these individual ratings into distributions. For a specific demographic slice $D_k$ (e.g., "Asian, 30-39, Female"), we can compute the distribution of safety scores for conversation $c_i$:

$P(\text{score} = s \mid c_i, D_k) = \frac{|\{r \in R_i : v(r)=s \land d(r) \in D_k\}|}{|\{r \in R_i : d(r) \in D_k\}|}$

Writing $P_{i,k}$ for this distribution, it becomes possible to compute not only the mean safety score $\mu_{i,k}$ but, more importantly, measures of variance ($\sigma^2_{i,k}$), ambiguity (e.g., the entropy of the distribution, $H(P)$), and divergence between demographic groups (e.g., the KL divergence $D_{KL}(P_{i,k} \| P_{i,l})$). This mathematical precision is essential for moving beyond simple averages.
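
The quantities above can be computed with a few lines of standard-library Python. This is a sketch, not the authors' code: ratings are simplified to `(score, group)` pairs for a single conversation, the function names are hypothetical, and the smoothing constant `eps` in the KL divergence is an added assumption that keeps the value finite when a group never uses some score.

```python
import math
from collections import Counter


def score_distribution(ratings, group, scores=(0, 1)):
    """P(score = s | c_i, D_k) from (score, group) pairs of one conversation."""
    in_group = [s for s, g in ratings if g == group]
    if not in_group:
        raise ValueError(f"no ratings from group {group!r}")
    counts = Counter(in_group)
    return {s: counts[s] / len(in_group) for s in scores}


def mean_and_variance(p):
    """Mean and variance of a score distribution given as {score: probability}."""
    mu = sum(s * prob for s, prob in p.items())
    var = sum(prob * (s - mu) ** 2 for s, prob in p.items())
    return mu, var


def entropy(p):
    """Shannon entropy H(P) in bits; terms with zero probability contribute 0."""
    return -sum(prob * math.log2(prob) for prob in p.values() if prob > 0)


def kl_divergence(p, q, eps=1e-9):
    """D_KL(P || Q) in bits; q is smoothed by eps (an assumption) to avoid log of 0."""
    n = len(q)
    return sum(
        prob * math.log2(prob / ((q[s] + eps) / (1 + n * eps)))
        for s, prob in p.items() if prob > 0
    )


# Toy ratings for one conversation: (score, group), with 1 = unsafe.
toy = [(1, "A"), (0, "A"), (0, "A"), (0, "A"),
       (1, "B"), (1, "B"), (0, "B"), (1, "B")]
p_a = score_distribution(toy, "A")
p_b = score_distribution(toy, "B")
mu_a, var_a = mean_and_variance(p_a)
print(p_a, p_b, mu_a, var_a, entropy(p_a), kl_divergence(p_a, p_b))
```

KL divergence is asymmetric, so `kl_divergence(p_a, p_b)` and `kl_divergence(p_b, p_a)` generally differ; a symmetric measure such as the Jensen-Shannon distance is often easier to read in group comparisons.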

6. Experimental Results & Chart Descriptions

Although the PDF excerpt is a manuscript under review and does not contain full experimental results, the dataset as described enables several analyses that could be presented as charts:

  • Chart 1: Demographic Disagreement Heat Map: A matrix plot showing pairwise divergence (e.g., Jensen-Shannon distance) in safety score distributions between demographic groups (e.g., Group A: White Male 50+ vs. Group B: Hispanic Female 18-29) across a sample of contentious conversation topics. This chart would highlight where perceptions diverge most.
  • Chart 2: Ambiguity vs. Agreement Scatter Plot:
  • Chart 3: Model Performance Disaggregation Bar Chart: Compares the performance (e.g., F1 score) of a standard safety classifier when evaluated against the "ground truth" defined by different demographic groups. A large performance drop for some groups would indicate that the model's alignment is skewed.

The strength of DICES is that it yields the data needed to build these charts, moving evaluation from a single number to a multifaceted dashboard.
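
The matrix behind a disagreement heat map like Chart 1 could be produced along these lines. The sketch assumes each group's ratings have already been reduced to a `{score: probability}` dictionary over the same scores; the group names and probabilities are made-up illustration values, and the function names are hypothetical.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance (square root of the JS divergence, base-2 logs).

    p and q are {score: probability} dicts over the same set of scores.
    The result lies between 0 (identical) and 1 (disjoint support).
    """
    m = {s: 0.5 * (p[s] + q[s]) for s in p}

    def kl(a, b):
        # Wherever a[s] > 0 the mixture b[s] is also > 0, so the ratio is safe.
        return sum(a[s] * math.log2(a[s] / b[s]) for s in a if a[s] > 0)

    return math.sqrt(max(0.0, 0.5 * kl(p, m) + 0.5 * kl(q, m)))


def disagreement_matrix(dists):
    """Pairwise JS distances between groups; returns (group order, matrix rows)."""
    groups = sorted(dists)
    rows = [[js_distance(dists[a], dists[b]) for b in groups] for a in groups]
    return groups, rows


# Hypothetical per-group distributions for one contentious item (1 = unsafe).
toy_dists = {
    "group_A": {0: 0.85, 1: 0.15},
    "group_B": {0: 0.35, 1: 0.65},
    "group_C": {0: 0.80, 1: 0.20},
}
groups, matrix = disagreement_matrix(toy_dists)
for name, row in zip(groups, matrix):
    print(name, [round(x, 3) for x in row])
```

The resulting rows can be passed to any plotting library as a heat map; large off-diagonal cells mark the group pairs whose perceptions diverge most.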

7. Analysis Framework: Example Case Study

Scenario: A conversational AI produces a joke in response to a user prompt. The training data and a standard safety evaluation label it "safe" (humor).

DICES Analysis:

  1. Data Query: Query the DICES dataset for similar conversation items involving jokes or humor on related topics.
  2. Distribution Analysis: Examine the distribution of safety ratings. You might find:
    • $P(\text{unsafe} \mid \text{age}=18-29) = 0.15$
    • $P(\text{unsafe} \mid \text{age}=60+) = 0.65$
    • $P(\text{unsafe} \mid \text{ethnicity}=E1) = 0.20$
    • $P(\text{unsafe} \mid \text{ethnicity}=E2) = 0.55$
  3. Interpretation: The "safety" of this joke is not an absolute fact but a function of demographics. The model's output, although it satisfies a broad "safe" guideline, carries a substantial risk of being perceived as offensive by older adults and by members of ethnic group E2.
  4. Action: The simplistic response is to block all humor. A more nuanced response, informed by DICES, could be to: a) flag this content type as "high demographic variance," b) develop a user-context mechanism that lets the model adjust its style of humor, or c) add a transparent note: "This response uses humor. Perceptions of humor vary widely across cultures and age groups."

This case study illustrates how DICES shifts the question from "Is this safe?" to "Safe for whom, and under what conditions?"
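
Option a) in the action step, flagging content as "high demographic variance," can be sketched as a simple gap check on the illustrative probabilities from the case study. The function name `flag_demographic_variance` and the 0.3 threshold are assumptions chosen for illustration; a real system would need a threshold justified by data.

```python
def flag_demographic_variance(unsafe_rates, gap_threshold=0.3):
    """Flag attributes whose groups disagree strongly about safety.

    unsafe_rates: {attribute: {group: P(unsafe | group)}}
    gap_threshold: minimum max-minus-min gap to flag (hypothetical value).
    Returns {attribute: gap} for every flagged attribute.
    """
    flags = {}
    for attribute, rates in unsafe_rates.items():
        gap = max(rates.values()) - min(rates.values())
        if gap >= gap_threshold:
            flags[attribute] = round(gap, 2)
    return flags


# The illustrative values from the case study above.
case_study_rates = {
    "age": {"18-29": 0.15, "60+": 0.65},
    "ethnicity": {"E1": 0.20, "E2": 0.55},
}
flags = flag_demographic_variance(case_study_rates)
print(flags)
```

Both attributes exceed the assumed threshold here, so this item would be routed to whichever follow-up (user-context adjustment or a transparency note) the product team has chosen.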

8. Future Applications & Research Directions

The DICES framework opens several important avenues for future work:

  • Personalized & Adaptive Safety Models: The logical endpoint is not a single safety filter but models that can infer the relevant user context (with appropriate privacy safeguards) and adjust safety thresholds or content-generation strategies accordingly. This fits the broader trend in ML toward personalization, as seen in recommender systems.
  • Dynamic and Continuous Evaluation: Developing methods to keep safety-perception datasets like DICES updated in near real time, capturing evolving social norms and emerging controversies, much as language models themselves are continually updated.
  • Intersectional Analysis Tools: Extending the demographic framework to better capture intersectional identities, moving beyond independent categories to understand the compounded experiences of people who belong to multiple minority groups.
  • Integration with Reinforcement Learning from Human Feedback (RLHF): Using disaggregated human feedback from datasets like DICES to train reward models that are sensitive to demographic alignment, preventing optimization toward a single, potentially narrow, view of a "good" or "safe" conversation. This addresses a known limitation of standard RLHF, as highlighted by research from Anthropic and DeepMind on scalable oversight.
  • Global Expansion: Scaling data collection to a truly global level, covering non-Western cultures and languages, to counter the Anglocentric bias pervasive in many AI safety resources.
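
As one hedged illustration of how disaggregated ratings could enter reward-model or classifier training, the per-group distributions from Section 5.1 can serve as soft targets. The formulation below is an assumption for illustration, not a method from the DICES paper: $f_\theta(s \mid c_i)$ is a hypothetical model's predicted probability of score $s$, and $w_k \ge 0$ are hypothetical group weights with $\sum_k w_k = 1$:

$\mathcal{L}(\theta) = -\sum_i \sum_k w_k \sum_s P(\text{score} = s \mid c_i, D_k) \, \log f_\theta(s \mid c_i)$

If the groups partition the raters and each $w_k$ is set to that group's share of the raters for the item, this reduces to ordinary cross-entropy against the pooled rating distribution. Equal weights instead give every group the same influence regardless of its size, which is one simple way to keep a large group from dominating the training signal.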

9. References

  1. Aroyo, L., Taylor, A. S., Díaz, M., Homan, C. M., Parrish, A., Serapio-García, G., Prabhakaran, V., & Wang, D. (2023). DICES Dataset: Diversity in Conversational AI Evaluation for Safety. arXiv preprint arXiv:2306.11247.
  2. Bommasani, R., et al. (2021). On the Opportunities and Risks of Foundation Models. Stanford Center for Research on Foundation Models (CRFM).
  3. Gehman, S., Gururangan, S., Sap, M., Choi, Y., & Smith, N. A. (2020). RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models. Findings of the Association for Computational Linguistics: EMNLP 2020.
  4. Ouyang, L., et al. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems (NeurIPS).
  5. Stanford Institute for Human-Centered AI (HAI). (2023). Artificial Intelligence Index Report 2023. Stanford University.
  6. Weidinger, L., et al. (2021). Ethical and social risks of harm from language models. arXiv preprint arXiv:2112.04359.
  7. Zhu, J.-Y., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. Proceedings of the IEEE International Conference on Computer Vision (ICCV). (Cited as an example of a framework, CycleGAN, that handles unpaired data, analogous to how DICES handles diverse, unaligned human judgments.)