A Wizard of Oz Study for API Virtual Assistant Dialogue Data

An analysis of a Wizard of Oz experiment that simulates API usage dialogues to build training data for software engineering virtual assistants.

1. Introduction & Overview

This paper addresses a critical problem in building specialized virtual assistants for software engineering: the lack of high-quality, task-relevant dialogue data. General-purpose assistants (e.g., Siri, Alexa) have thrived on large and varied datasets, but specialized domains such as API programming suffer from data scarcity. The authors ran a Wizard of Oz (WoZ) experiment, simulating an API help assistant that was secretly operated by human experts, to collect and annotate a corpus of programmer-assistant interactions. The main contribution is not only the data but also a structured annotation scheme designed to decode the complex conversational strategies programmers use when seeking API knowledge.

2. Research Methodology & Experimental Design

The study used a controlled WoZ setup to elicit natural dialogue without the constraints of a weak AI model.

2.1. The Wizard of Oz Protocol

Thirty professional programmers were recruited to complete programming tasks using two APIs, which are not named here. They interacted with what they believed was an AI assistant. Without their knowledge, the "assistant" was a human expert (the "Wizard") responding in real time through a chat interface. This approach sidesteps the AI cold-start problem and makes it possible to collect rich, goal-directed dialogues that reflect real user needs and conversational patterns.

2.2. Participant & Task Selection

Participants were practicing software developers. The tasks were designed to be non-trivial, requiring substantial API exploration and problem solving, so that the dialogues would cover a range of question types and information needs beyond simple syntax lookups.

3. Data Annotation Scheme

The raw dialogue corpus was annotated along four key dimensions, giving a multi-faceted view of each utterance.

3.1. Dialogue Act Dimensions

  • Illocutionary Intent: The purpose of the speech act (e.g., request, inform, confirm).
  • API Information Type: The category of API knowledge being sought (e.g., concept, function, parameter, example).
  • Backward-Facing Function: How the utterance relates to the preceding dialogue (e.g., answer, clarification, correction).
  • Traceability to API Components: A mapping from the dialogue to specific elements in the API documentation.

3.2. Annotation Framework

This multi-dimensional scheme goes beyond simple intent classification. It captures the structural and pragmatic complexity of technical dialogue, providing a framework for training models that understand not only what is being asked but also the context and knowledge structure of the question.
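
As a rough illustration of how one annotated utterance might be stored, the sketch below models the four dimensions as a small validated record. The class name, field names, and label inventories are assumptions made for this example (the labels are taken from the examples listed above, not from the paper's full codebook).

```python
from dataclasses import dataclass
from typing import FrozenSet

# Hypothetical label inventories, based only on the examples listed above.
INTENTS = {"request", "inform", "confirm", "suggest"}
INFO_TYPES = {"concept", "function", "parameter", "example"}
BACKWARD_FUNCTIONS = {"new_question", "answer", "clarification", "correction"}


@dataclass(frozen=True)
class AnnotatedUtterance:
    """One utterance plus its labels on the four annotation dimensions."""
    speaker: str
    text: str
    intent: str                       # illocutionary intent
    info_types: FrozenSet[str]        # API information types (may be several)
    backward_function: str            # relation to the preceding dialogue
    traced_components: FrozenSet[str] = frozenset()  # API elements referred to

    def __post_init__(self):
        # Reject labels that are not part of the (assumed) inventories.
        if self.intent not in INTENTS:
            raise ValueError(f"unknown intent: {self.intent}")
        unknown = self.info_types - INFO_TYPES
        if unknown:
            raise ValueError(f"unknown info types: {sorted(unknown)}")
        if self.backward_function not in BACKWARD_FUNCTIONS:
            raise ValueError(f"unknown backward function: {self.backward_function}")


# "lib.connect" is a made-up component name used only for illustration.
utterance = AnnotatedUtterance(
    speaker="programmer",
    text="Which parameters does connect() accept?",
    intent="request",
    info_types=frozenset({"parameter"}),
    backward_function="new_question",
    traced_components=frozenset({"lib.connect"}),
)
```

Validating labels at construction time is one design option; a real annotation tool might instead load the codebook from a file and report disagreements between annotators.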

4. Key Results & Statistical Insights

  • Participants: 30 professional programmers
  • APIs used: 2 distinct APIs for the tasks
  • Annotation dimensions: 4 dialogue-act dimensions

The study produced a dataset that shows a wide variety of interactions. Preliminary analysis indicated that programmers' questions often combine several information types and call for multi-turn, context-dependent answers. The traceability dimension proved important: it points to the need for future AI assistants to integrate tightly with, and reason over, structured API documentation, much as retrieval-augmented generation (RAG) grounds answers in an external knowledge base.
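
To make the idea of grounding a question in documentation concrete, here is a deliberately simple retrieval sketch. It ranks documentation entries by word overlap with the question; this is a stand-in for a real retriever, not the method used in the paper. The component names (`files.open`, `files.close`, `net.connect`), their descriptions, and the stopword list are all invented for the example.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical documentation entries keyed by API component name.
API_DOCS: Dict[str, str] = {
    "files.open": "Open a file handle. The mode parameter must be 'r' or 'w'.",
    "files.close": "Close an open file handle and flush buffers.",
    "net.connect": "Connect to a host. The timeout parameter is in seconds.",
}

# Small assumed stopword list so common words do not dominate the score.
STOPWORDS = {"the", "a", "an", "in", "is", "to", "what", "can", "and", "or", "be"}


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its distinct non-stopword tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower())) - STOPWORDS


def retrieve(query: str, docs: Dict[str, str], k: int = 2) -> List[Tuple[str, int]]:
    """Return up to k (component, overlap) pairs, best match first."""
    q = tokenize(query)
    scored = [(name, len(q & tokenize(name + " " + body))) for name, body in docs.items()]
    scored = [s for s in scored if s[1] > 0]
    # Sort by descending overlap, then by name for a stable order.
    return sorted(scored, key=lambda s: (-s[1], s[0]))[:k]


hits = retrieve("What values can the mode parameter take in files.open?", API_DOCS)
# hits == [("files.open", 4), ("files.close", 2)]
```

The retrieved component names play the role of the traceability labels: an answer generator would be given the matching documentation text rather than answering from memory alone.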

5. Technical Analysis & Mathematical Framework

The annotation scheme can be formalized. Let a dialogue $D$ be a sequence of utterances $(u_1, u_2, \dots, u_n)$. Each utterance $u_i$ is annotated with a tuple: $$\mathbf{a}_i = [I_i, T_i, B_i, R_i]$$ where:

  • $I_i \in \mathcal{I}$: Illocutionary intent (drawn from a finite set of labels).
  • $T_i \subseteq \mathcal{T}$: Set of API information types (equivalently, $T_i \in \mathcal{P}(\mathcal{T})$, the power set of the type labels).
  • $B_i \in \mathcal{B}$: Backward-facing function label.
  • $R_i \subseteq \mathcal{C}$: Set of traced API components, drawn from a known set $\mathcal{C}$.

The corpus $\mathcal{D}$ is the set of all annotated dialogues. This structured representation matters for training machine learning models, particularly sequence-to-sequence models or graph neural networks, to predict an appropriate assistant response $u_{i+1}$ given the context $(\mathbf{a}_1, \dots, \mathbf{a}_i)$ and the underlying API knowledge graph defined by $\mathcal{C}$.
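
The prediction setup described above can be sketched as a function that turns one annotated dialogue into (context, target) training pairs: for every assistant turn $u_{i+1}$, the context is the list of annotations $\mathbf{a}_1, \dots, \mathbf{a}_i$ of all earlier turns. The dictionary keys (`speaker`, `text`, `annotation`) and the sample dialogue are assumptions for illustration; the paper does not prescribe a storage format.

```python
from typing import Dict, List, Tuple

Annotation = Dict[str, object]  # one a_i: intent, info types, backward function, traced


def make_training_pairs(dialogue: List[Dict]) -> List[Tuple[List[Annotation], str]]:
    """Pair each assistant turn with the annotations of all preceding turns."""
    pairs = []
    for i, turn in enumerate(dialogue):
        # Skip an assistant turn at position 0: it has no context to condition on.
        if turn["speaker"] == "assistant" and i > 0:
            context = [earlier["annotation"] for earlier in dialogue[:i]]
            pairs.append((context, turn["text"]))
    return pairs


# Hypothetical four-turn dialogue; "lib.connect" is a made-up component.
dialogue = [
    {"speaker": "programmer", "text": "How do I set a timeout?",
     "annotation": {"intent": "request", "info_types": {"parameter"},
                    "backward": "new_question", "traced": {"lib.connect"}}},
    {"speaker": "assistant", "text": "Pass timeout=<seconds> to connect().",
     "annotation": {"intent": "inform", "info_types": {"parameter"},
                    "backward": "answer", "traced": {"lib.connect"}}},
    {"speaker": "programmer", "text": "Is there a default?",
     "annotation": {"intent": "request", "info_types": {"parameter"},
                    "backward": "clarification", "traced": {"lib.connect"}}},
    {"speaker": "assistant", "text": "Yes, it is documented on connect().",
     "annotation": {"intent": "inform", "info_types": {"parameter"},
                    "backward": "answer", "traced": {"lib.connect"}}},
]

pairs = make_training_pairs(dialogue)
# Two assistant turns give two pairs, with context lengths 1 and 3.
```

A sequence-to-sequence model would consume a serialized form of each context; a graph-based model could additionally link the `traced` components to nodes built from $\mathcal{C}$.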

6. Analysis Framework: Case Study Example

Scenario: A programmer is trying to authenticate a user with `OAuth2Library` but runs into an error about an invalid `scope`.

Dialogue Snippet & Annotation:

  • Programmer: "The `authenticate_user` call is failing with 'invalid scope'. What are the valid scopes?"
    • Intent: Request.
    • Information Type: Parameter/Constraint, Error Meaning.
    • Backward-Facing Function: New Question (triggered by an error).
    • Traceability: `OAuth2Library.authenticate_user`, parameter `scope`.
  • Wizard/Assistant: "The valid scopes are 'read', 'write', and 'admin'. The error means the string you passed is not one of these. Have you checked the `OAuth2Config` object?"
    • Intent: Inform, Suggest.
    • Information Type: Enumerated Values, Conceptual Guidance.
    • Backward-Facing Function: Answer, Clarification.
    • Traceability: documentation for the `scope` parameter, the `OAuth2Config` class.

This example shows the multi-step reasoning involved: from the error message, to the valid parameter values, to a related configuration object. A simple QA model would likely miss this chain, whereas a model trained on the annotated dataset can learn to make the connection.

7. Future Applications & Research Directions

  • Specialized IDE Plugins: The data feeds directly into AI completion and question-answering systems inside the IDE that understand project-specific context, similar to the evolution of GitHub Copilot from Codex but with deeper API grounding.
  • Automated Documentation Improvement: Dialogue patterns can reveal gaps or ambiguities in API documentation. For example, frequent questions about parameter `X` suggest that `X` is poorly documented.
  • Cross-API Generalization: Can dialogue strategies learned for one API (e.g., Java Streams) transfer to another (e.g., Python Pandas)? This requires learning domain-independent dialogue policies.
  • Integration with LLMs & RAG: This annotated dataset is a good training and evaluation benchmark for retrieval-augmented generation systems in the software domain, testing their ability to retrieve the correct API components and produce grounded, helpful answers.
  • Proactive Assistance: Beyond answering questions, future assistants could analyze the code context and offer relevant API suggestions, a direction already shown by tools such as Amazon CodeWhisperer.
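
The documentation-improvement idea in the list above can be sketched as a frequency count over the traceability labels of request utterances: components that programmers ask about repeatedly become candidates for documentation review. The record keys (`intent`, `traced`), the threshold, and the component names are assumptions for this example.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def documentation_hotspots(utterances: Iterable[Dict], min_count: int = 2) -> List[Tuple[str, int]]:
    """Count how many request utterances trace to each API component."""
    counts = Counter()
    for u in utterances:
        if u["intent"] == "request":
            # set() ensures a component counts once per utterance.
            counts.update(set(u["traced"]))
    return [(comp, n) for comp, n in counts.most_common() if n >= min_count]


# Hypothetical annotated utterances; only requests are counted.
utterances = [
    {"intent": "request", "traced": ["lib.connect.timeout"]},
    {"intent": "inform", "traced": ["lib.connect.timeout"]},
    {"intent": "request", "traced": ["lib.connect.timeout", "lib.connect"]},
    {"intent": "request", "traced": ["lib.close"]},
    {"intent": "request", "traced": ["lib.connect.timeout"]},
]

hotspots = documentation_hotspots(utterances)
# hotspots == [("lib.connect.timeout", 3)]
```

A raw count is only a first signal: a heavily used component naturally attracts more questions, so a real analysis would probably normalize by how often each component is used.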

8. References

  1. McTear, M., Callejas, Z., & Griol, D. (2016). The Conversational Interface: Talking to Smart Devices. Springer.
  2. Serban, I. V., et al. (2015). A survey of available corpora for building data-driven dialogue systems. arXiv preprint arXiv:1512.05742.
  3. Rieser, V., & Lemon, O. (2011). Reinforcement Learning for Adaptive Dialogue Systems: A Data-driven Methodology for Dialogue Management and Natural Language Generation. Springer.
  4. Chen, M., et al. (2021). Evaluating Large Language Models Trained on Code. arXiv preprint arXiv:2107.03374. (Codex/Copilot)
  5. Lewis, P., et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS.
  6. OpenAI. (2023). GPT-4 Technical Report. arXiv preprint arXiv:2303.08774.
  7. Allamanis, M., et al. (2018). A survey of machine learning for big code and naturalness. ACM Computing Surveys.

9. Original Expert Analysis

Core Insight: This paper is a surgical strike on the infrastructure problem of AI for software engineering: data. The authors correctly identify that the progress of large language models (LLMs) such as GPT-4 or Codex in specialized domains is held back by the lack of high-quality, structured, task-relevant dialogue data. Their work is less about the "Wizard trick" than about the annotation scheme, a deliberate intellectual effort to build a "Rosetta Stone" that translates complex programmer questions into a linguistic structure machines can learn from. This is the unglamorous but essential groundwork that precedes any capable AI application, and it echoes the data-centric AI philosophy championed by Andrew Ng.

Logical Flow & Contribution: The logic is sound: 1) Problem: No high-quality SE dialogue data exists. 2) Method: Use WoZ to simulate an ideal AI and collect natural data. 3) Analysis: Apply a rigorous, multi-dimensional scheme to make the data machine-readable. 4) Outcome: A foundational dataset and framework for training future models. The main contribution is not the 30 dialogues themselves; it is the demonstration that such dialogues can be captured and structured systematically. The work provides a methodological blueprint for creating similar datasets for other SE tasks (debugging, design, migration), much as ImageNet provided a template for visual data.

Strengths & Flaws: The strength of the work lies in its methodological rigor and foresight. The four-dimensional annotation scheme is well thought out, covering both the pragmatic layer (intent) and the semantic layer (API traceability). Scale, however, is an obvious limitation. Thirty programmers and two APIs amount to a pilot study. The real test is scale and diversity: does the scheme hold for 300 programmers across 20 different APIs (e.g., low-level systems APIs and high-level web frameworks)? In addition, although the WoZ method elicits natural questions, the Wizard's answers, however expert, are a single point of potential bias: the "best" answer may not be the only one or the optimal one. The study also sidesteps the major engineering challenge of integrating this structured knowledge into a fast, scalable assistant, a challenge illustrated by the deployment of systems such as Microsoft's IntelliCode.

Actionable Insights: For researchers: Replicate and extend this approach now. The field needs an "SE-DialogueNet." For tool builders: Use this annotation scheme to fine-tune or prompt existing LLMs. Instead of generic prompting, structure inputs like `[Intent: Request; Info_Type: Parameter; Traceability_to: lib.foo.bar]`. For API providers: This study is a direct feedback loop into your documentation strategy. The "traceability" dimension maps directly onto documentation gaps. Ultimately, this work makes a strong case that the next advances in AI development tools will come not from a larger general-purpose LLM but from a model fine-tuned and specialized on a high-quality dataset of the kind this paper begins to build. The race is now on to build it.
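
The bracketed input format suggested for tool builders can be produced with a small helper like the one below. This is a minimal sketch: the function name, argument names, and the choice to put the header on its own line before the question are assumptions, and how well an LLM responds to such a header would have to be tested empirically.

```python
from typing import Iterable


def format_structured_prompt(
    question: str,
    intent: str,
    info_types: Iterable[str],
    traced: Iterable[str],
) -> str:
    """Prefix a question with its annotation labels in bracketed form."""
    header = "[Intent: {}; Info_Type: {}; Traceability_to: {}]".format(
        intent, ", ".join(info_types), ", ".join(traced)
    )
    return header + "\n" + question


prompt = format_structured_prompt(
    question="What does bar accept?",
    intent="Request",
    info_types=["Parameter"],
    traced=["lib.foo.bar"],
)
# prompt starts with "[Intent: Request; Info_Type: Parameter; Traceability_to: lib.foo.bar]"
```

For fine-tuning, the same header could be attached to every training example so the model sees the annotation labels alongside the raw text.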