An Empirical Study of Method Chaining in Java #

Tomoki Nakamaru, Tomomasa Matsunaga, Tetsuro Yamazaki, Soramichi Akiyama, and Shigeru Chiba. The 17th International Conference on Mining Software Repositories (MSR 2020). Seoul, Korea (Online). June 2020.

Important Notice: The MSR 2023 paper by Keshk and Dyer validates and extends our results and provides a replication package for their results. If you are interested in our paper, we strongly encourage you to read theirs as well.

Problems of our dataset #

Due to bugs in our mining scripts, the dataset does not contain

  • chains containing field accesses or cast expressions; for example, foo.bar.baz(), foo.bar().baz.qux(), and ((Foo) foo.bar()).baz(),
  • chains passed as arguments to non-last method invocations in a chain; for example, x.foo() in y.bar(x.foo()).baz().

For this reason, we recommend against using data.txt from our dataset in further research. These problems may not significantly change the trends shown in our paper, but further investigation is needed to determine their impact on our results.
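To make the missing chain shapes concrete, the following is a minimal Java sketch of the three cases listed above. The `Node` type and its members (`step`, `take`, `untyped`, `self`) are hypothetical names invented for this illustration; they do not come from the paper, the mining scripts, or the dataset.

```java
class ChainShapes {
    // Hypothetical fluent type; every name here is an assumption made for illustration.
    static class Node {
        final StringBuilder log;
        Node self = this; // field that lets a chain pass through a field access

        Node(StringBuilder log) { this.log = log; }

        Node step(String name) { log.append(name).append(';'); return this; }
        Node take(Node other) { log.append("take;"); return this; }
        Object untyped() { log.append("untyped;"); return this; }
    }

    // Runs one chain of each missing shape and returns the recorded call order.
    static String run() {
        StringBuilder log = new StringBuilder();
        Node x = new Node(log);
        Node y = new Node(log);

        // Shape 1: the chain passes through a field access, like foo.bar().baz.qux().
        x.step("a").self.step("b");

        // Shape 2: the chain passes through a cast, like ((Foo) foo.bar()).baz().
        ((Node) x.untyped()).step("c");

        // Shape 3: a chain used as an argument to a non-last invocation,
        // like x.foo() in y.bar(x.foo()).baz().
        y.take(x.step("d")).step("e");

        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

According to the description above, chains of the first two shapes and the inner chain `x.step("d")` of the third would be absent from the dataset.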

Errata #

  • “longer than 2” → “longer than or equal to 2”
    • 8th line of 2nd paragraph in Section 3
    • Page 95
  • “longer than 2” → “longer than or equal to 2”
    • 2nd line of the last paragraph in Section 3.2
    • Page 96
  • “longer than 8” → “longer than or equal to 8”
    • 7th line of 2nd paragraph in Section 3.3
    • Page 96
  • “longer than 42” → “longer than or equal to 42”
    • 8th line of 2nd paragraph in Section 3.3
    • Page 96
  • “longer than 2” → “longer than or equal to 2”
    • 3rd line of 1st paragraph in RQ1 summary
    • Page 98
  • “longer than 42” → “longer than or equal to 42”
    • 1st line of 3rd paragraph in RQ1 summary
    • Page 98

Additional notes #

  • The number of repositories (Section 2)

    We first collected 2,814 Java repositories that were listed at least once among the 1,000 most-starred Java repositories on GitHub between Nov. 10th and Dec. 21st, 2019. We then extracted Java files from the year-end revisions of those repositories (2018-end, 2017-end, …). Since some repositories contained no Java files in or before 2018, the number of repository names listed in metadata.txt is 2,756.

  • The criteria for testing code and non-testing code (Section 3)

    We categorized a Java file as testing code if its path contains “test” and as non-testing code otherwise. This criterion may be inaccurate. However, we assume that most highly starred projects follow the popular Java project-structure convention that places test code under “**/test/**/*.java”, so we do not expect the results shown in Figure 6 to be severely inaccurate. Nevertheless, further investigation is needed, and this point should have been discussed in “Threats to Validity”.

  • Duplicate/lost projects in our dataset

    Since we naively used the list of repositories as is, our dataset (unintentionally) contains duplicate repository pairs. Those pairs are listed in “Duplicate pairs” below.

    As of Nov. 2021, 19 repositories are no longer publicly accessible. Those “lost” repositories are listed in “Lost repositories” below.
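The testing-code criterion described in the notes above can be sketched as a plain substring check. This is an illustration under stated assumptions, not the mining script itself: the class and method names are hypothetical, the sample paths are made up, and the match is assumed to be case-sensitive because the note does not say otherwise.

```java
class TestPathCheck {
    // A file counts as testing code if its path contains "test".
    // Assumption: case-sensitive match; the note does not specify this.
    static boolean isTestingCode(String path) {
        return path.contains("test");
    }

    public static void main(String[] args) {
        String[] paths = {
            "src/test/java/com/example/FooTest.java",        // conventional test layout
            "src/main/java/com/example/Foo.java",            // conventional main layout
            "src/main/java/com/example/latest/Version.java", // "latest" contains "test"
            "src/Tests/FooSpec.java"                         // only a capitalized "Tests"
        };
        for (String p : paths) {
            System.out.println(isTestingCode(p) + "  " + p);
        }
    }
}
```

The last two paths show how the criterion can misclassify files: a non-testing file under a directory such as `latest` is counted as testing code, and, under the case-sensitivity assumption, a file under `Tests` is not.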

Duplicate pairs #

  • alibaba/spring-cloud-alibaba = spring-cloud-incubator/spring-cloud-alibaba
  • apache/incubator-zipkin = openzipkin/zipkin
  • apache/incubator-zipkin-brave = openzipkin/brave
  • atomashpolskiy/bittorrent = atomashpolskiy/bt
  • Exrick/x-boot = Exrick/xboot
  • googlesamples/android-Camera2Basic = googlearchive/android-Camera2Basic
  • googlesamples/android-ConstraintLayoutExamples = googlearchive/android-ConstraintLayoutExamples
  • googlesamples/android-PictureInPicture = googlearchive/android-PictureInPicture
  • googlesamples/android-play-location = android/location-samples
  • googlesamples/android-RuntimePermissions = googlearchive/android-RuntimePermissions
  • googlesamples/android-testing = android/testing-samples
  • jamesdbloom/mockserver = mock-server/mockserver
  • LyndonChin/MasteringAndroidDataBinding = liangfeidotme/MasteringAndroidDataBinding
  • nanchen2251/AiYaCompressHelper = nanchen2251/CompressHelper
  • rubensousa/GravitySnapHelper = rubensousa/RecyclerViewSnap
  • twitter-archive/commons = twitter/commons
  • vondear/RxTool = Tamsiree/RxTool
  • weexteam/hackernews-App-powered-by-Apache-Weex = weexteam/weex-hackernews
  • yahoo/anthelion = YahooArchive/anthelion
  • yu199195/hmily = Dromara/hmily
  • yu199195/Raincat = Dromara/Raincat
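Anyone reusing the repository list may want to collapse these pairs before counting projects. The sketch below shows one way to do so; it is not part of our tooling. The class and method names are hypothetical, only two pairs are included for brevity, and treating the right-hand name as canonical is an assumption, since the list only states that the two names refer to the same project.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DuplicateFilter {
    // Two pairs copied from the list above; the others would be added the same way.
    // Assumption: the right-hand name is used as the canonical one.
    static final Map<String, String> ALIASES = Map.of(
        "apache/incubator-zipkin", "openzipkin/zipkin",
        "twitter-archive/commons", "twitter/commons");

    // Maps each name to its canonical form and keeps the first occurrence, preserving order.
    static List<String> dedupe(List<String> repos) {
        Set<String> seen = new LinkedHashSet<>();
        for (String repo : repos) {
            seen.add(ALIASES.getOrDefault(repo, repo));
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> repos = List.of(
            "apache/incubator-zipkin", "openzipkin/zipkin", "twitter/commons");
        System.out.println(dedupe(repos));
    }
}
```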

Lost repositories #

  • AndroidBootstrap/android-bootstrap
  • ankurkotwal/making-apps-beautiful
  • blynkkk/blynk-server
  • chifei/jweb-cms
  • cloudera/crunch
  • commonsguy/cwac-camera
  • GoranM/bdx
  • kingthy/TVRemoteIME
  • kot32go/KSimpleLibrary
  • layerhq/Atlas-Android
  • psaravan/JamsMusicPlayer
  • qibin0506/MVPro
  • reenWYJ/reen-sharding
  • shouldit/android-proxy
  • SigPloiter/SigPloit
  • streamsets/datacollector
  • Tesco/mewbase
  • wenhuaijun/SearchPictureTool
  • wycm/selenium-geetest-crack

Acknowledgement #

I thank Prof. Robert Dyer for his thorough inspection of our dataset and results. The content of this page is based on his careful review of our paper.