Abstract

In distributed software development, many projects have adopted the integration-manager workflow to organize collaboration. In this workflow, a developer (the requester) submits a contribution as a pull request, which contains a title, a description, an author, and the code needed to fix a bug or add a feature. A member of the project's core team (the integrator) must review the pull request and decide whether to accept or reject the code. In many open-source projects, the demand for pull request integration is high while integrators' time is scarce, which increases both the number of open pull requests and the average integration time. Integrators, in turn, want to reduce the time needed to integrate pull requests while ensuring code quality. In this decision-making process, it can be extremely useful to predict information such as the most appropriate integrator for a pull request and the pull request's lifetime.

Some studies have already explored these prediction scenarios. However, they differ in several aspects of their materials and methods: sets of predictive attributes, classification and regression techniques, experimental processes, and number of projects. In this context, the main objectives of this thesis are to compare the sets of predictive attributes used in previous work with the set proposed here, and to evaluate whether attribute selection techniques can identify more suitable attribute subsets that improve performance on the tasks of predicting integrators and pull request lifetime.

In the experiments, the attribute sets were evaluated with different classification and regression algorithms and attribute selection strategies. Compared with previous approaches, our proposal for recommending integrators achieved the best accuracy in 29 of the 32 projects for the Top-1 recommendation and reached the best average normalized improvement for the Top-1 (19.93%), Top-3 (41.91%), and Top-5 (52.60%) recommendations. For lifetime prediction, our proposal also achieved the best average normalized improvement among the compared approaches, obtaining the best accuracy in 18 of the 20 projects and an average normalized improvement of 14.68%.

Keywords: Pull Request; Integrators; Lifetime; Classification; Regression; Attribute Selection.
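The Top-1, Top-3, and Top-5 figures reported for integrator recommendation can be read through the usual Top-k accuracy definition: a recommendation counts as correct when the integrator who actually handled the pull request appears among the first k recommended names. The sketch below illustrates that definition only. It assumes this standard reading of Top-k, which the abstract does not spell out; the function name `top_k_accuracy` and the sample data are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def top_k_accuracy(
    ranked: Sequence[Sequence[str]],
    actual: Sequence[str],
    k: int,
) -> float:
    """Fraction of pull requests whose actual integrator is in the first k recommendations.

    Assumption: this is the conventional Top-k accuracy, not necessarily the
    exact metric implementation used in the thesis.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(ranked) != len(actual):
        raise ValueError("ranked and actual must have the same length")
    if not actual:
        return 0.0
    # A hit occurs when the true integrator is among the top-k ranked names.
    hits = sum(1 for recs, who in zip(ranked, actual) if who in recs[:k])
    return hits / len(actual)


# Hypothetical example: four pull requests, each with a ranked list of
# recommended integrators and the integrator who actually handled it.
ranked = [
    ["alice", "bob", "carol"],
    ["bob", "alice", "carol"],
    ["carol", "bob", "alice"],
    ["alice", "carol", "bob"],
]
actual = ["alice", "alice", "alice", "dave"]

for k in (1, 2, 3):
    print(f"Top-{k} accuracy: {top_k_accuracy(ranked, actual, k):.2f}")
```

Because the top-k list for a larger k always contains the list for a smaller k, accuracy can only stay equal or grow as k increases, which matches the ordering of the Top-1, Top-3, and Top-5 results above.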