1. Introduction
The PUNCH4NFDI consortium (Particles, Universe, NuClei and Hadrons for the National Research Data Infrastructure), funded by the German Research Foundation (DFG), represents about 9,000 scientists from the particle, astro-, astroparticle, hadron, and nuclear physics communities. Embedded in the wider NFDI initiative, its primary goal is to establish a federated, FAIR (Findable, Accessible, Interoperable, Reusable) science data platform. The platform aims to provide unified access to the heterogeneous compute and storage resources of the participating institutions, addressing the shared challenges posed by very large data volumes and compute-intensive analysis algorithms. This paper focuses on the two architectural concepts, Compute4PUNCH and Storage4PUNCH, designed to federate Germany's research infrastructure.
2. Federated Compute Infrastructure – Compute4PUNCH
Compute4PUNCH addresses the challenge of making effective use of a wide range of contributed resources, including High-Throughput Computing (HTC), High-Performance Computing (HPC), and Cloud systems, distributed across Germany. These resources differ in architecture, operating system, software stack, and access policy. The central design principle is to create a single overlay system with minimal intrusion on the existing, operational resource providers.
2.1. Core Scheduling & Integration
The federation is built around HTCondor as the central overlay batch system. Heterogeneous resources are integrated dynamically using the COBalD/TARDIS resource meta-scheduler. COBalD/TARDIS acts as an intelligent broker, directing work to suitable backends (e.g., Slurm, Kubernetes clusters) based on resource availability, job requirements, and policies. This produces one logical pool of resources out of physically disparate systems.
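The brokering idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Backend` and `Job` classes, the site labels, and the "most free cores wins" rule are hypothetical and are not the COBalD/TARDIS API or its actual decision logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    name: str            # hypothetical site label
    kind: str            # e.g. "slurm" or "kubernetes"
    free_cores: int
    free_memory_gb: int
    allows_gpu: bool


@dataclass
class Job:
    cores: int
    memory_gb: int
    needs_gpu: bool = False


def route(job: Job, backends: List[Backend]) -> Optional[Backend]:
    """Return the eligible backend with the most free cores, or None."""
    eligible = [
        b for b in backends
        if b.free_cores >= job.cores
        and b.free_memory_gb >= job.memory_gb
        and (b.allows_gpu or not job.needs_gpu)
    ]
    # Assumed policy: prefer the backend with the most idle capacity.
    return max(eligible, key=lambda b: b.free_cores, default=None)


backends = [
    Backend("site-a", "slurm", free_cores=64, free_memory_gb=256, allows_gpu=False),
    Backend("site-b", "kubernetes", free_cores=16, free_memory_gb=64, allows_gpu=True),
]
print(route(Job(cores=8, memory_gb=32), backends))
print(route(Job(cores=8, memory_gb=32, needs_gpu=True), backends))
```

A production meta-scheduler would also weigh policies and observed demand; the filter-then-rank structure is the part this sketch is meant to show.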
2.2. User Access & Software Environment
User entry points are provided through traditional login nodes and a JupyterHub service. A token-based Authentication and Authorization Infrastructure (AAI) standardizes access. The complexity of software environments is handled through container technologies (e.g., Docker, Singularity/Apptainer) and the CERN Virtual Machine File System (CVMFS), which delivers scalable, read-only software distribution to compute nodes worldwide.
3. Federated Storage Infrastructure – Storage4PUNCH
Storage4PUNCH aims to federate community-provided storage systems, primarily based on the dCache and XRootD technologies, both well established in High-Energy Physics (HEP). The federation uses common namespaces and protocols (such as xrootd and WebDAV) to present a single data access layer. The concept also evaluates the integration of caching solutions and metadata handling services to improve data locality and discoverability across the federation.
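One way to picture a common namespace is a function that turns a single logical path into protocol-specific URLs. The sketch below is an assumption-laden illustration: the host name `storage.example.org`, the ports, and the `/punch/...` path are placeholders, not real PUNCH4NFDI endpoints.

```python
# Hypothetical endpoints; a real federation publishes its own hosts and ports.
ENDPOINTS = {
    "xrootd": ("root", "storage.example.org:1094"),
    "webdav": ("https", "storage.example.org:2880"),
}


def access_url(logical_path: str, protocol: str) -> str:
    """Build a protocol-specific URL for one logical file name."""
    if not logical_path.startswith("/"):
        raise ValueError("logical_path must be absolute")
    scheme, host = ENDPOINTS[protocol]
    # XRootD URLs conventionally put a double slash before an absolute path.
    path = "/" + logical_path if protocol == "xrootd" else logical_path
    return f"{scheme}://{host}{path}"


for proto in ("xrootd", "webdav"):
    print(access_url("/punch/data/file.root", proto))
```

The point of the design is that users and jobs refer only to the logical path, while the protocol and endpoint can change without touching analysis code.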
4. Technical Implementation & Components
4.1. Authentication & Authorization (AAI)
A token-based AAI (likely built on the OAuth 2.0/OpenID Connect standards, similar to WLCG IAM or INDIGO IAM) provides a single sign-on experience. It maps community identities to local resource permissions, hiding the differing local authentication systems (e.g., Kerberos, SSH keys) from the user.
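The identity-mapping step can be sketched as a lookup from token claims to a local account. Everything named here is hypothetical: the `groups` claim name, the group paths, the account names, and the quotas are assumptions for illustration, and the sketch assumes the token signature has already been verified elsewhere.

```python
import time
from typing import Optional

# Hypothetical policy table: community group -> local account and quota.
LOCAL_POLICY = {
    "/punch/astro": {"account": "punch_astro", "max_cores": 512},
    "/punch/hep": {"account": "punch_hep", "max_cores": 1024},
}


def map_identity(claims: dict, now: Optional[float] = None) -> dict:
    """Map already-verified token claims to a local account entry."""
    now = time.time() if now is None else now
    # "exp" is the standard expiry claim (seconds since the epoch).
    if claims.get("exp", 0) <= now:
        raise PermissionError("token has expired")
    # The claim name "groups" is an assumption; deployments differ.
    for group in claims.get("groups", []):
        if group in LOCAL_POLICY:
            return {"subject": claims["sub"], **LOCAL_POLICY[group]}
    raise PermissionError("no group in the token maps to a local account")


claims = {"sub": "user-123", "exp": time.time() + 3600, "groups": ["/punch/astro"]}
print(map_identity(claims))
```

Keeping the policy table at each site is one way to respect site autonomy: the federation asserts who the user is, and the site decides what that identity may do locally.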
4.2. Resource Meta-Scheduler: COBalD/TARDIS
COBalD (the Opportunistic Balancing Daemon) and TARDIS (the Transparent Adaptive Resource Dynamic Integration System) work together. COBalD makes the high-level scheduling decisions, while TARDIS manages the lifecycle of "pilots" or "placeholder jobs" on the target resources. This separation allows flexible policies (e.g., cost, fairness, priority) to be enforced and lets the system adapt to changing resource conditions. Scheduling can be modeled as an optimization problem that minimizes the cost function $C_{total} = \sum_{i}(w_i \cdot T_i) + \lambda \cdot R$, where $T_i$ is the turnaround time of job $i$, $w_i$ is its priority weight, $R$ is the cost of resource usage, and $\lambda$ is a tunable balancing parameter.
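As a worked example of the cost function, the following minimal sketch evaluates $C_{total}$ for a handful of jobs. The numeric values are made up for illustration; only the formula comes from the text.

```python
from typing import Sequence


def total_cost(turnaround: Sequence[float], weights: Sequence[float],
               resource_cost: float, lam: float) -> float:
    """Evaluate C_total = sum_i(w_i * T_i) + lambda * R."""
    if len(turnaround) != len(weights):
        raise ValueError("need one weight per job")
    weighted_turnaround = sum(w * t for w, t in zip(weights, turnaround))
    return weighted_turnaround + lam * resource_cost


# Illustrative values: three jobs with turnaround times in hours.
cost = total_cost(turnaround=[2.0, 5.0, 1.0], weights=[3, 1, 2],
                  resource_cost=10.0, lam=0.5)
print(cost)  # 3*2 + 1*5 + 2*1 + 0.5*10 = 18.0
```

Raising $\lambda$ makes resource usage more expensive relative to waiting time, so a scheduler minimizing this function would provision fewer resources and accept longer turnaround.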
4.3. Data & Software Layer
CVMFS is central to software distribution. It uses a content-addressable storage model and aggressive caching (with stratum 0/1 servers and local Squid caches) to deliver software repositories efficiently. The federation likely uses the CVMFS stratum hierarchy, with a central PUNCH stratum 0 repository and institutional stratum 1 mirrors. Data access follows a similar federated model: storage elements (SEs) publish their endpoints to a global directory (such as Rucio or a simple REST service), which lets clients resolve data locations transparently.
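Content-addressable storage means that an object's key is a hash of its bytes, so identical files are stored once and can be verified after transfer or caching. The toy store below illustrates only that principle; it is not CVMFS's on-disk format, and the choice of SHA-256 here is an assumption made for the example.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed object store."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, data: bytes) -> str:
        """Store data under its digest and return the digest."""
        digest = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same key, so it is stored only once.
        self._objects.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        """Fetch an object and verify that it still matches its address."""
        data = self._objects[digest]
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("object is corrupt")
        return data

    def __len__(self) -> int:
        return len(self._objects)


store = ContentStore()
key_a = store.put(b"libanalysis.so contents")
key_b = store.put(b"libanalysis.so contents")  # same bytes, e.g. in a new release
print(key_a == key_b, len(store))
```

Because keys never change meaning, any intermediate cache can serve an object without consulting the origin server, which is what makes aggressive caching safe.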
5. Prototype Status & Initial Experience
The paper reports that initial prototypes of both Compute4PUNCH and Storage4PUNCH are operational. First scientific applications have been run on them, providing valuable feedback on performance, usability, and integration problems. Although the available excerpt gives no specific benchmark numbers, the successful runs indicate that the basic functionality of the overlay batch system, the AAI integration, and software delivery via CVMFS has been validated. This experience is guiding refinements in policy configuration, error handling, and user documentation.
6. Key Insights & Strategic Analysis
Core Insight: PUNCH4NFDI is not building a new supercomputer; it is engineering a "federation fabric" that intelligently stitches together existing, fragmented resources. This is a strategic shift from monolithic infrastructure to agile, software-defined resource aggregation. It mirrors trends in the commercial cloud but is tailored to the constraints and culture of publicly funded academia.
Logical Flow: The architecture follows a clear, dependency-ordered logic: 1) Federate Identity (AAI) to solve the "who" problem, 2) Abstract Resources (COBalD/TARDIS + HTCondor) to solve the "where" problem, and 3) Decouple the Environment (Containers + CVMFS) to solve the "with what" problem. This layered abstraction is textbook systems engineering, reminiscent of the success of the Worldwide LHC Computing Grid (WLCG), but applied to a more heterogeneous set of resources.
Strengths & Flaws: The main strength is the non-disruptive adoption model. By relying on overlay technologies and respecting site autonomy, it lowers the barrier for resource providers, a critical success factor for federations. This is also its Achilles' heel. The performance overhead of meta-scheduling and the inherent difficulty of debugging across independently administered systems can be significant. The "minimal intrusion" mandate may limit the ability to implement advanced features such as tight storage-compute coupling or dynamic network provisioning, potentially capping efficiency gains. Compared with a purpose-built, centralized system such as Google's Borg or a single Kubernetes cluster, the federation will always have higher latency and less predictable utilization.
Actionable Insights: For other consortia considering this approach: 1) Invest heavily in monitoring and observability from day one. Tools such as Grafana/Prometheus for infrastructure and APM (Application Performance Monitoring) for user jobs are non-negotiable for managing the complexity. 2) Standardize on a small set of container base images to reduce the CVMFS maintenance burden. 3) Establish a clear, tiered support model that separates federation-level problems from local site problems. The real test will not be technical feasibility, which the HEP community has already demonstrated, but operational sustainability and user satisfaction at scale.
7. Technical Deep Dive
Mathematical Model for Resource Scheduling: The COBalD/TARDIS system can be thought of as solving a constrained optimization problem. Let $J$ be the set of jobs, $R$ the set of resources (note that in this section $R$ denotes a set, not the resource-usage cost of Section 4.2), and $S$ the set of resource states (e.g., idle, busy, draining). The scheduler aims to maximize a utility function $U$ that accounts for job priority $p_j$, resource efficiency $e_{j,r}$, and cost $c_r$: $$\max \sum_{j \in J} \sum_{r \in R} x_{j,r} \cdot U(p_j, e_{j,r}, c_r)$$ subject to the constraints: $$\sum_{j} x_{j,r} \leq C_r \quad \forall r \in R \quad \text{(Resource Capacity)}$$ $$\sum_{r} x_{j,r} \leq 1 \quad \forall j \in J \quad \text{(Job Assignment)}$$ $$x_{j,r} \in \{0,1\} \quad \text{(Binary Decision Variable)}$$ where $C_r$ is the capacity of resource $r$ and $x_{j,r}=1$ if job $j$ is assigned to resource $r$. TARDIS determines which assignments are feasible based on the real-time state $S$.
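For very small instances the model can be solved by exhaustive search, which makes the roles of the constraints easy to see. The sketch below is illustrative only: the text does not specify the form of $U$, so `utility` assumes $U = p_j \cdot e_{j,r} - c_r$, and the job names, resource names, and numbers are invented. Real schedulers use heuristics rather than enumeration.

```python
from itertools import product


def utility(priority: float, efficiency: float, cost: float) -> float:
    # Assumed form of U; the source text leaves it unspecified.
    return priority * efficiency - cost


def best_assignment(priorities, efficiency, costs, capacity):
    """Enumerate all x_{j,r} satisfying the constraints; return the best."""
    jobs = list(priorities)
    resources = list(capacity)
    best_value, best_plan = 0.0, {j: None for j in jobs}
    # Each job picks one resource or None, which enforces sum_r x_{j,r} <= 1.
    for choice in product([None] + resources, repeat=len(jobs)):
        load = {r: 0 for r in resources}
        for r in choice:
            if r is not None:
                load[r] += 1
        # Capacity constraint: sum_j x_{j,r} <= C_r for every resource.
        if any(load[r] > capacity[r] for r in resources):
            continue
        value = sum(utility(priorities[j], efficiency[j][r], costs[r])
                    for j, r in zip(jobs, choice) if r is not None)
        if value > best_value:
            best_value, best_plan = value, dict(zip(jobs, choice))
    return best_value, best_plan


priorities = {"j1": 3, "j2": 1}
efficiency = {"j1": {"hpc": 1.0, "cloud": 0.5}, "j2": {"hpc": 1.0, "cloud": 0.5}}
costs = {"hpc": 0.5, "cloud": 0.25}
capacity = {"hpc": 1, "cloud": 1}
print(best_assignment(priorities, efficiency, costs, capacity))
```

With one slot per resource, the high-priority job takes the more efficient resource and the other job falls back to the second one, which is the behaviour the capacity constraint is meant to produce.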
Experimental Results & Chart Descriptions: Although the provided PDF excerpt contains no specific performance plots, a typical evaluation would include plots comparing:
1. Job Throughput: A chart of the number of jobs completed per hour in the federated pool versus the individual resource clusters, showing the benefit of aggregation.
2. Resource Utilization Heatmap: A grid showing the percentage of CPUs/GPUs in use across the resource providers (KIT, DESY, Bielefeld, etc.) over one week, showing how well load is balanced.
3. Job Startup Latency CDF: A cumulative distribution function plot comparing the time from job submission to start of execution in the federated system with direct submission to a local batch system, exposing the meta-scheduling overhead.
4. Data Access Performance: A chart comparing read/write speeds for data accessed locally, from a federated storage element in the same region, and from a remote federated storage element, illustrating the effect of caching and the network.
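The startup-latency comparison in item 3 relies on an empirical CDF, which is straightforward to compute from raw latency samples. The sketch below shows the computation only; the two sample lists are synthetic placeholders invented for the example and are not measurements from PUNCH4NFDI.

```python
from typing import List, Sequence, Tuple


def empirical_cdf(samples: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (value, fraction of samples <= value) pairs in sorted order."""
    ordered = sorted(samples)
    n = len(ordered)
    return [(x, (i + 1) / n) for i, x in enumerate(ordered)]


def fraction_within(samples: Sequence[float], threshold: float) -> float:
    """Fraction of jobs that started within `threshold` seconds."""
    return sum(1 for s in samples if s <= threshold) / len(samples)


# Synthetic placeholder latencies in seconds (NOT measured data).
synthetic_local = [5, 8, 12, 20]
synthetic_federated = [30, 45, 60, 120]

for label, data in (("local", synthetic_local), ("federated", synthetic_federated)):
    print(label, empirical_cdf(data), fraction_within(data, 60))
```

Plotting the two CDFs on one axis makes the overhead visible as a horizontal shift between the curves, which is easier to read than comparing means when the distributions have long tails.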
8. Analysis Framework & Conceptual Model
Case Study: Federated Analysis of Astronomical Survey Data
Scenario: A research group at the Thüringer Landessternwarte Tautenburg needs to process 1 PB of imaging data from the Sloan Digital Sky Survey (SDSS) to identify galaxy clusters, a compute-intensive task requiring ~100,000 CPU-hours.
Process via Compute4PUNCH/Storage4PUNCH:
1. Authentication: The researcher logs in to the PUNCH JupyterHub with institutional credentials (via the token-based AAI).
2. Software Environment: The Jupyter notebook kernel runs from a container image hosted on CVMFS that contains all required astronomy packages (Astropy, SExtractor, etc.).
3. Job Definition & Submission: The researcher defines a parameter sweep in the notebook. The notebook uses a PUNCH client library to submit the sweep as an HTCondor DAG (Directed Acyclic Graph) to the federated pool.
4. Resource Matching & Execution: COBalD/TARDIS evaluates the job requirements (CPU, memory, possibly GPU) and steers the jobs to available slots across the federation, e.g., HTC pools at KIT, HPC queues at Bielefeld University, and cloud nodes at DESY. Jobs read their input data through the federated XRootD namespace from the nearest storage location, possibly via a cache.
5. Result Aggregation: Output files are written back to the federated storage. The researcher monitors progress through a single web dashboard and finally collects the results in the notebook for analysis.
This case shows how identity, compute, storage, and software management work together.
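Step 3 of the case study can be made concrete by generating a DAGMan input file for a parameter sweep followed by a merge step. This is a sketch under assumptions: the text does not describe the PUNCH client library, so plain Python is used instead, and the submit-file names `sweep.sub` and `merge.sub` as well as the `threshold` parameter are hypothetical. The `JOB`, `VARS`, and `PARENT ... CHILD` keywords are standard HTCondor DAGMan syntax.

```python
from typing import Sequence


def build_sweep_dag(thresholds: Sequence[float],
                    sweep_submit: str = "sweep.sub",
                    merge_submit: str = "merge.sub") -> str:
    """Return DAGMan text: one node per parameter value, then a merge node."""
    if not thresholds:
        raise ValueError("need at least one parameter value")
    lines = []
    names = []
    for i, value in enumerate(thresholds):
        name = f"sweep_{i}"
        names.append(name)
        lines.append(f"JOB {name} {sweep_submit}")
        # VARS passes a per-node macro that the submit file can reference.
        lines.append(f'VARS {name} threshold="{value}"')
    lines.append(f"JOB merge {merge_submit}")
    # The merge node runs only after every sweep node has finished.
    lines.append(f"PARENT {' '.join(names)} CHILD merge")
    return "\n".join(lines) + "\n"


print(build_sweep_dag([0.5, 1.0, 2.0]))
```

The resulting text could be written to a file and handed to `condor_submit_dag`; expressing the merge as a child node lets the batch system, rather than the notebook, track when all sweep jobs are done.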
9. Future Applications & Development Roadmap
The PUNCH4NFDI infrastructure lays the groundwork for several advanced applications:
1. Federated Machine Learning Training: The heterogeneous resource pool, potentially including GPU clusters, could support distributed ML training with frameworks such as PyTorch or TensorFlow across institutional boundaries, addressing the need for privacy-preserving training where data cannot be centralized.
2. Interactive Analysis & Visualization: Extending the JupyterHub service with scalable, backend-powered interactive visualization tools (e.g., Jupyter widgets connected to Dask clusters on the federation) for exploring large datasets.
3. Integration with External Clouds & HPC Centers: Extending the federation model to include commercial cloud credits (e.g., AWS, GCP) or national supercomputing centers (e.g., JUWELS at JSC) through a common billing/accounting layer, creating a true federated cloud for science.
4. Metadata and Data Lake Integration: Moving beyond simple file federation to a data lake architecture in which the storage layer is coupled with a unified metadata catalog (e.g., based on Rucio or iRODS), enabling data discovery and provenance tracking across communities.
5. Workflow-as-a-Service: Offering higher-level platform services such as REANA (Reproducible Analysis Platform) or Apache Airflow on top of the federated infrastructure, so that scientists can define and run complex, reproducible analysis pipelines without managing the underlying infrastructure.
The development roadmap will focus on hardening the service for production, expanding the resource pool, integrating more sophisticated data management tools, and developing user-friendly APIs and SDKs to lower the adoption barrier for non-expert users.
10. References
- PUNCH4NFDI Consortium. (2024). PUNCH4NFDI White Paper. [Internal consortium document].
- Thain, D., Tannenbaum, T., & Livny, M. (2005). Distributed computing in practice: the Condor experience. Concurrency and Computation: Practice and Experience, 17(2-4), 323-356. https://doi.org/10.1002/cpe.938
- Blomer, J., et al. (2011). Software distribution in the CernVM file system with Parrot. Journal of Physics: Conference Series, 331(4), 042009. https://doi.org/10.1088/1742-6596/331/4/042009
- Giffels, M., et al. (2022). COBalD and TARDIS – Dynamic resource overlay for opportunistic computing. EPJ Web of Conferences, 251, 02009. https://doi.org/10.1051/epjconf/202225102009
- dCache Collaboration. (2023). dCache: A distributed data storage system. Retrieved from https://www.dcache.org/
- XRootD Collaboration. (2023). XRootD: High performance, scalable, fault-tolerant access to data. Retrieved from http://xrootd.org/
- Wilkinson, M. D., et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018. https://doi.org/10.1038/sdata.2016.18
- Verma, A., et al. (2015). Large-scale cluster management at Google with Borg. Proceedings of the Tenth European Conference on Computer Systems (EuroSys '15). https://doi.org/10.1145/2741948.2741964