Clone Detection Literature

This page lists papers related to clone detection, along with links to clone detection tools (standalone and Eclipse plug-ins), related events, and research groups. Papers on clone detection techniques are listed first, followed by the many papers that address other aspects of clones beyond the detection process itself.

If any of the information below is incorrect or out of date, please email . Also, please email any suggestions of other papers. The papers are sorted by year of publication (most recent first).

Last updated: 08/27/2008

Process

Detection

Detecting Clones in Business Applications
[ PDF ] Jin Guo, Ying Zou – Working Conference on Reverse Engineering (WCRE) – 2008
Cross-Language Clone Detection
Nicholas Kraft, Brandon Bonds, Randy Smith – International Conference on Software Engineering and Knowledge Engineering (SEKE) – 2008
NICAD: Accurate Detection of Near-Miss Intentional Clones Using Flexible Pretty-Printing and Code Normalization
[ PDF ] Chanchal K. Roy, James R. Cordy – International Conference on Program Comprehension (ICPC) – 2008
Clone Detection in Automotive Model-Based Development
[ PDF ] Florian Deissenboeck, Benjamin Hummel, Elmar Juergens, Bernhard Schaetz, Stefan Wagner, Stefan Teuchert, Jean-Francois Girard – International Conference on Software Engineering (ICSE) – 2008
Scalable Detection of Semantic Clones
[ PDF ] Mark Gabel, Lingxiao Jiang, Zhendong Su – International Conference on Software Engineering (ICSE) – 2008
Duplicate Code Detection Using Anti-Unification
[ PDF ] Peter Bulychev, Marius Minea – Spring Young Researchers Colloquium on Software Engineering (SYRCoSE) – 2008
Clone Detection via Structural Abstraction
[ DOI ] William Evans, Christopher Fraser, Fei Ma – Working Conference on Reverse Engineering (WCRE) – 2007
Efficient Token Based Clone Detection with Flexible Tokenization
[ DOI ] Hamid Basit, Simon Pugliesi, William Smyth, Andrei Turpin, Stan Jarzabek – European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) – 2007
DECKARD: Scalable and Accurate Tree-based Detection of Code Clones
[ PDF ] Lingxiao Jiang, Ghassan Misherghi, Zhendong Su, Stephane Glondu – International Conference on Software Engineering (ICSE) – 2007
Clone Detection Using Abstract Syntax Suffix Trees
[ PDF ] Rainer Koschke, Raimar Falke, Pierre Frenzel – Working Conference on Reverse Engineering (WCRE) – 2006
Phoenix-Based Clone Detection Using Suffix Trees
[ PDF ] Robert Tairas, Jeff Gray – ACM-SE Conference – 2006
On the Effectiveness of Clone Detection by String Matching
[ DOI ] Stephane Ducasse, Oscar Nierstrasz, Matthias Rieger – Journal of Software Maintenance and Evolution: Research and Practice – 2006
SDD: High Performance Code Clone Detection System for Large Scale Source Code
[ DOI ] Seunghak Lee, Iryoung Jeong – Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA) – 2005
Detecting Higher-level Similarity Patterns in Programs
[ PDF ] Hamid Basit, Stan Jarzabek – European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) – 2005
Archeology of Code Duplication: Recovering Duplication Chains From Small Duplication Fragments
[ PDF ] Richard Wettel, Radu Marinescu – International Symposium on Symbolic and Numeric Algorithms for Scientific Computing – 2005
Clone Detection via Structural Abstraction
[ PDF ] William Evans, Christopher Fraser – Technical Report – 2005
Effective Clone Detection Without Language Barriers
[ PDF ] Matthias Rieger – Ph.D. Thesis – 2005
Method-Level Code Clone Detection on Transformed Abstract Syntax Trees Using Sequence Matching Algorithms
[ PDF ] Kevin Greenan – Student Project Report – 2005
CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code
[ PDF ] Zhenmin Li, Shan Lu, Suvda Myagmar, Yuanyuan Zhou – Symposium on Operating Systems Design and Implementation (OSDI) – 2004
Practical Language-Independent Detection of Near-Miss Clones
[ PDF ] James R. Cordy, Thomas Dean, Nikita Synytskyy – IBM Centre for Advanced Studies Conference (CASCON) – 2004
Clone Detection in Source Code by Frequent Itemset Techniques
[ DOI ] Vera Wahler, Dietmar Seipel, Gregor Fischer – International Workshop on Source Code Analysis and Manipulation (SCAM) – 2004
Automated Detection Of Code Duplication Clusters
[ PDF ] Richard Wettel – Diploma Thesis – 2004
Are Decomposition Slices Clones?
[ PDF ] Keith Gallagher, Lucas Layman – International Workshop on Program Comprehension (IWPC) – 2003
On Detection of Gapped Code Clones using Gap Locations
[ DOI ] Yasushi Ueda, Toshihiro Kamiya, Shinji Kusumoto, Katsuro Inoue – Asia-Pacific Software Engineering Conference (APSEC) – 2002
CCFinder: A Multilinguistic Token-Based Code Clone Detection System for Large Scale Source Code
[ PDF ] Toshihiro Kamiya, Shinji Kusumoto, Katsuro Inoue – IEEE Transactions on Software Engineering – 2002
Identification of High-Level Concept Clones in Source Code
[ PDF ] Andrian Marcus, Jonathan Maletic – International Conference on Automated Software Engineering (ASE) – 2001
Identifying Similar Code with Program Dependence Graphs
[ PDF ] Jens Krinke – Working Conference on Reverse Engineering (WCRE) – 2001
Using Slicing to Identify Duplication in Source Code
[ PDF ] Raghavan Komondoor, Susan Horwitz – International Symposium on Static Analysis (SAS) – 2001
A Language Independent Approach for Detecting Duplicated Code
[ PDF ] Stephane Ducasse, Matthias Rieger, Serge Demeyer – International Conference on Software Maintenance (ICSM) – 1999
Clone Detection Using Abstract Syntax Trees
[ PDF ] Ira Baxter, Andrew Yahin, Leonardo Moura, Marcelo Sant'Anna, Lorraine Bier – International Conference on Software Maintenance (ICSM) – 1998
Evaluation Experiments on the Detection of Programming Patterns using Software Metrics
[ PDF ] Kostas Kontogiannis – Working Conference on Reverse Engineering (WCRE) – 1997
Experiment on the Automatic Detection of Function Clones in a Software System Using Metrics
[ DOI ] Jean Mayrand, Claude Leblanc, Ettore Merlo – International Conference on Software Maintenance (ICSM) – 1996
On Finding Duplication and Near-Duplication in Large Software Systems
[ DOI | PS ] Brenda Baker – International Conference on Software Maintenance (ICSM) – 1996
Pattern Matching for Clone and Concept Detection
[ PDF ] Kostas Kontogiannis – Automated Software Engineering – 1996
The Development of a Software Clone Detector
[ PDF ] Neil Davey, Paul Barson, Simon Field, Ray Frank, Stewart Tansley – International Journal of Applied Software Technology – 1995
Substring Matching for Clone Detection and Change Tracking
[ DOI ] John Johnson – International Conference on Software Maintenance (ICSM) – 1994

Analysis

An Empirical Study of Intentional Function Clones in Open Source Software Systems
Chanchal K. Roy, James R. Cordy – Working Conference on Reverse Engineering (WCRE) – 2008
Assessing the Effect of Clones on Changeability
Angela Lozano, Michel Wermelinger – International Conference on Software Maintenance (ICSM) – 2008
Query-based Filtering and Graphical View Generation for Cloning Information
Stan Jarzabek – International Conference on Software Maintenance (ICSM) – 2008
Variation Analysis of Context-Sharing Identifiers with Code Clone
Toshihiro Kamiya – International Conference on Software Maintenance (ICSM) – 2008
Is Cloned Code More Stable than Non-Cloned Code?
Jens Krinke – International Working Conference on Source Code Analysis and Manipulation (SCAM) – 2008
Supporting the Grow-and-Prune Model in Software Product Lines Evolution Using Clone Detection
[ PDF ] Thilo Mende, Felix Beckwermert, Rainer Koschke, Gerald Meier – European Conference on Software Maintenance and Reengineering (CSMR) – 2008
Static Bug Detection Through Analysis of Inconsistent Clones
[ PDF ] Elmar Juergens, Benjamin Hummel, Florian Deissenboeck, Martin Feilkas – Testmethoden fur Software (TESO) – 2008
Applying a Code Clone Detection Method to Domain Analysis of Device Drivers
[ DOI ] Yuseung Ma, Dukkuyn Woo – Asia-Pacific Software Engineering Conference (APSEC) – 2007
Clone Smells in Software Evolution
[ DOI ] Tibor Bakota, Rudolf Ferenc, Tibor Gyimothy – International Conference on Software Maintenance (ICSM) – 2007
A Study of Consistent and Inconsistent Changes to Code Clones
[ PDF ] Jens Krinke – Working Conference on Reverse Engineering (WCRE) – 2007
Context-Based Detection of Clone-Related Bugs
[ PDF ] Lingxiao Jiang, Zhendong Su, Edwin Chiu – European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) – 2007
Finding Clones with Dup: Analysis of an Experiment
[ DOI ] Brenda Baker – IEEE Transactions on Software Engineering – 2007
A Framework for Studying Clones in Large Software Systems
[ PDF ] Zhenming Jiang, Ahmed Hassan – International Working Conference on Source Code Analysis and Manipulation (SCAM) – 2007
SoftGUESS: Visualization and Exploration of Code Clones in Context
[ PDF ] Eytan Adar, Miryung Kim – International Conference on Software Engineering (ICSE) – 2007
Using Server Pages to Unify Clones in Web Applications: A Trade-off Analysis
[ PDF ] Damith Rajapakse, Stan Jarzabek – International Conference on Software Engineering (ICSE) – 2007
Very-Large Scale Code Clone Analysis and Visualization of Open Source
[ DOI ] Simone Livieri, Yoshiki Higo, Makoto Matsushita, Katsuro Inoue – International Conference on Software Engineering (ICSE) – 2007
Analysis of the Linux Kernel Evolution Using Code Clone Coverage
[ DOI ] Simone Livieri, Yoshiki Higo, Makoto Matsushita, Katsuro Inoue – International Workshop on Mining Software Repositories (MSR) – 2007
Evaluating the Harmfulness of Cloning: A Change Based Experiment
[ DOI ] Angela Lozano, Michel Wermelinger, Bashar Nuseibeh – International Workshop on Mining Software Repositories (MSR) – 2007
How Clones are Maintained: An Empirical Study
[ DOI ] Lerina Aversano, Luigi Cerulo, Massimiliano Di Penta – European Conference on Software Maintenance and Reengineering (CSMR) – 2007
Visualizing and Understanding Code Duplication in Large Software Systems
[ PDF ] Zhenming Jiang – Masters Thesis – 2006
"Cloning Considered Harmful" Considered Harmful
[ PDF ] Cory Kapser, Michael Godfrey – Working Conference on Reverse Engineering (WCRE) – 2006
Visualization of Clone Detection Results
[ PDF ] Robert Tairas, Jeff Gray, Ira Baxter – Eclipse Technology Exchange Workshop (ETX) – 2006
Supporting the Analysis of Clones in Software Systems
[ PDF ] Cory Kapser, Michael Godfrey – Journal of Software Maintenance and Evolution: Research and Practice – 2006
Cloning by Accident: An Empirical Study of Source Code Cloning Across Software Systems
[ PDF ] Raihan Al-Ekram, Cory Kapser, Richard Holt, Michael Godfrey – International Symposium on Empirical Software Engineering – 2005
An Empirical Study of Code Clone Genealogies
[ PDF ] Miryung Kim, Vibha Sazawal, David Notkin, Gail Murphy – European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) – 2005
Improved Tool Support for the Investigation of Duplication in Software
[ PDF ] Cory Kapser, Michael Godfrey – International Conference on Software Maintenance (ICSM) – 2005
An Empirical Study on Limits of Clone Unification Using Generics
[ PDF ] Hamid Basit, Damith Rajapakse, Stan Jarzabek – International Conference on Software Engineering and Knowledge Engineering (SEKE) – 2005
An Investigation of Cloning in Web Applications
[ PDF ] Hamid Basit, Damith Rajapakse, Stan Jarzabek – International Conference on Web Engineering – 2005
Beyond Templates: a Study of Clones in the STL and Some General Implications
[ PDF ] Hamid Basit, Damith Rajapakse, Stan Jarzabek – International Conference on Software Engineering (ICSE) – 2005
Using a Clone Genealogy Extractor for Understanding and Supporting Evolution of Code Clones
[ PDF ] Miryung Kim, David Notkin – International Workshop on Mining Software Repositories (MSR) – 2005
Insights into System-Wide Code Duplication
[ PDF ] Matthias Rieger, Stephane Ducasse, Michele Lanza – Working Conference on Reverse Engineering (WCRE) – 2004
Aiding Comprehension of Cloning Through Categorization
[ PDF ] Cory Kapser – International Workshop on Principles of Software Evolution – 2004
Studying Software Evolution Using Clone Detection
[ PDF ] Filip Van Rysselberghe, Serge Demeyer – International Workshop on Object-Oriented Reengineering – 2004
Analyzing Cloning Evolution in the Linux Kernel
[ PDF ] Giuliano Antoniol, Umberto Villano, Ettore Merlo, Massimiliano Di Penta – Information and Software Technology – 2002
Software Quality Analysis by Code Clones in Industrial Legacy Software
[ PDF ] Akito Monden, Daikai Nakae, Toshihiro Kamiya, Shin-ichi Sato, Ken-ichi Matsumoto – Symposium on Software Metrics (METRICS) – 2002
Modeling Clones Evolution Through Time Series
[ PDF ] Giuliano Antoniol, Gerardo Casazza, Massimiliano Di Penta, Ettore Merlo – International Conference on Software Maintenance (ICSM) – 2001
Measuring Clone Based Reengineering Opportunities
[ PDF | PS ] Magdalena Balazinska, Ettore Merlo, Michel Dagenais, Bruno Lague, Kostas Kontogiannis – International Software Metrics Symposium – 1999
Visualizing Textual Redundancy in Legacy Source
John Johnson – IBM Centre for Advanced Studies Conference (CASCON) – 1994

Maintenance

CloneTracker: Tool Support for Code Clone Management
[ PDF ] Ekwa Duala-Ekoko, Martin Robillard – International Conference on Software Engineering (ICSE) – 2008
CPC: An Eclipse Framework for Automated Clone Life Cycle Tracking and Update Anomaly Detection
[ PDF ] Valentin Weckerle – Diploma Thesis – 2008
Simultaneous Modification Support based on Code Clone Analysis
[ PDF ] Yoshiki Higo, Yasushi Ueda, Shinji Kusumoto, Katsuro Inoue – Asia-Pacific Software Engineering Conference (APSEC) – 2007
CReN: A Tool for Tracking Copy-and-Paste Code Clones and Renaming Identifiers Consistently in the IDE
[ PDF ] Patricia Jablonski, Daqing Hou – Eclipse Technology Exchange Workshop (ETX) – 2007
Tracking Code Clones in Evolving Software
[ PDF ] Ekwa Duala-Ekoko, Martin Robillard – International Conference on Software Engineering (ICSE) – 2007
Beyond Clone Detection
[ PDF ] Andy Chiu, David Hirtle – Student Project Report – 2007
Bumbo III: Clone Trouble
[ URL ] 2007
Code Clone Analysis Methods for Efficient Software Maintenance
[ PDF ] Yoshiki Higo – Ph.D. Thesis – 2006
An Algorithm for Detecting and Removing Clones in Java Code
[ PDF ] Nicolas Juillerat, Beat Hirsbrunner – Workshop on Software Evolution through Transformations (SeTra) – 2006
A Novel Approach to Optimize Clone Refactoring Activity
[ DOI ] Salah Bouktif, Giuliano Antoniol, Ettore Merlo, Markus Neteler – Genetic and Evolutionary Computation Conference – 2006
Unifying Clones with a Generative Programming Technique: A Case Study
[ PDF ] Stan Jarzabek, Shubiao Li – Journal of Software Maintenance and Evolution: Research and Practice – 2006
Managing Duplicated Code with Linked Editing
[ PDF ] Michael Toomim, Andrew Begel, Susan Graham – Symposium on Visual Languages - Human Centric Computing – 2004
Semi Automatic Removal of Duplicated Code
[ PDF ] Yidong Liu – Diploma Thesis – 2004
ARIES: Refactoring Support Environment based on Code Clone Analysis
[ PDF ] Yoshiki Higo, Toshihiro Kamiya, Shinji Kusumoto, Katsuro Inoue – International Conference on Software Engineering and Applications – 2004
Automated Duplicated-Code Detection and Procedure Extraction
[ PDF ] Raghavan Komondoor – Ph.D. Thesis – 2003
A Scenario Based Approach for Refactoring Duplicated Code in Object Oriented Systems
[ PDF ] Georges Koni N'Sapu – Diploma Thesis – 2001
Advanced Clone-analysis to Support Object-oriented System Refactoring
[ PDF | PS ] Magdalena Balazinska, Ettore Merlo, Michel Dagenais, Bruno Lague, Kostas Kontogiannis – Working Conference on Reverse Engineering (WCRE) – 2000
Partial Redesign of Java Software Systems Based on Clone Analysis
[ PDF | PS ] Magdalena Balazinska, Ettore Merlo, Michel Dagenais, Bruno Lague, Kostas Kontogiannis – Working Conference on Reverse Engineering (WCRE) – 1999
Removing Clones from the Code
[ PDF ] Richard Fanta, Vaclav Rajlich – Journal of Software Maintenance: Research and Practice – 1999

Surveys and Evaluations

Survey of Overall Research

A Survey on Software Clone Detection Research
[ PDF ] Chanchal K. Roy, James R. Cordy – Technical Report – 2007
Survey of Research on Software Clones
[ PDF ] Rainer Koschke – Dagstuhl Seminar: Duplication, Redundancy, and Similarity in Software – 2006

Evaluation of Clone Detection Tools

Scenario-based Comparison of Clone Detection Techniques
[ PDF ] Chanchal K. Roy, James R. Cordy – International Conference on Program Comprehension (ICPC) – 2008
Towards a Mutation-Based Automatic Framework for Evaluating Code Clone Detection Tools
[ PDF ] Chanchal K. Roy, James R. Cordy – Canadian Conference on Computer Science and Software Engineering (C3S2E) – 2008
Comparison and Evaluation of Clone Detection Tools
[ DOI ] Stefan Bellon, Rainer Koschke, Giuliano Antoniol, Jens Krinke, Ettore Merlo – IEEE Transactions on Software Engineering – 2007
Evaluating Clone Detection Techniques
[ PDF ] Filip Van Rysselberghe, Serge Demeyer – International Workshop on Evolution of Large Scale Industrial Applications (ELISA) – 2003
Detection of Software Clones Tool Comparison Experiment
[ PDF ] Stefan Bellon – International Workshop on Source Code Analysis and Manipulation (SCAM) – 2002
Evaluating Clone Detection Tools for Use during Preventative Maintenance
[ PDF ] Elizabeth Burd, John Bailey – International Workshop on Source Code Analysis and Manipulation (SCAM) – 2002

Tools

Standalone Tools

Axivion Bauhaus Suite
[ URL ]
CCFinder
[ URL ]
CloneDR
[ URL ]
Clone Digger
[ URL ]
Clone Detective (part of ConQAT)
[ URL ]
Copy Paste Detector
[ URL ]
Duplo
[ URL ]
Simian
[ URL ]

Eclipse Plug-ins

CloneTracker
[ URL ]
Consistent Renaming Tool (CReN)
[ URL ]
CopyPasteChange (CPC)
[ URL ]
Duplication Management Framework
[ URL ]
SDD
[ URL ]
SimScan
[ URL ]

In Visual Studio

Clone Detective (part of ConQAT)
[ URL ]

Related Topics

Reference Data

Problems Creating Task-relevant Clone Detection Reference Data
[ PDF ] Andrew Walenstein, Nitin Jyoti, Junwei Li, Yun Yang, Arun Lakhotia – Working Conference on Reverse Engineering (WCRE) – 2003

Copy and Paste Practices

Three Public Enemies: Cut, Copy, and Paste
[ DOI ] Zoltan Mann – IEEE Computer – 2006
How Developers Copy
[ PDF ] Mihai Balint, Tudor Girba, Radu Marinescu – International Conference on Program Comprehension (ICPC) – 2006
An Ethnographic Study of Copy and Paste Programming Practices in OOPL
[ PDF ] Miryung Kim, Lawrence Bergman, Tessa Lau, David Notkin – Symposium on Empirical Software Engineering – 2004

Aspect Mining

Mining Coding Patterns to Detect Crosscutting Concerns in Java Programs
Takashi Ishio, Hironori Date, Tatsuya Miyake, Katsuro Inoue – Working Conference on Reverse Engineering (WCRE) – 2008
Pitfalls in Aspect Mining
Kim Mens, Andy Kellens, Jens Krinke – Working Conference on Reverse Engineering (WCRE) – 2008
Aspect Mining from a Modelling Perspective
[ DOI ] Jing Zhang, Yuehua Lin, Jeff Gray, Robert Tairas – International Journal of Computer Applications in Technology – 2008
Evaluating Aspect Mining Techniques: A Case Study
[ DOI ] Chanchal K. Roy, Mohammad Gias Uddin, Banani Roy, Thomas Dean – International Conference on Program Comprehension (ICPC) – 2007
HAM: Cross-Cutting Concerns in Eclipse
[ PDF ] Silvia Breu, Thomas Zimmermann, Christian Lindig – Eclipse Technology Exchange Workshop (ETX) – 2006
Mining Aspects from Version History
[ PDF ] Silvia Breu, Thomas Zimmermann – International Conference on Automated Software Engineering (ASE) – 2006
On the Use of Clone Detection for Identifying Crosscutting Concern Code
[ DOI ] Magiel Bruntink, Arie van Deursen, Remco van Engelen, Tom Tourwe – IEEE Transactions on Software Engineering – 2005
A Survey of Aspect Mining Tools and Techniques
[ PDF ] Andy Kellens, Kim Mens – Technical Report – 2005
Towards Hybrid Aspect Mining: Static Extensions to Dynamic Aspect Mining
[ PDF ] Silvia Breu – Position Paper – 2004
Aspect Mining using Clone Class Metrics
[ PDF ] Magiel Bruntink – Workshop on Aspect Reverse Engineering – 2004
Control-Flow-Graph-Based Aspect Mining
[ PDF ] Jens Krinke, Silvia Breu – Workshop on Aspect Reverse Engineering – 2004
An Evaluation of Clone Detection Techniques for Identifying Crosscutting Concerns
[ PDF ] Magiel Bruntink, Arie van Deursen, Remco van Engelen, Tom Tourwe – International Conference on Software Maintenance (ICSM) – 2004

Related Links

Events

Duplication, Redundancy, and Similarity in Software
[ URL ] Dagstuhl Seminar - July 23-26, 2006
Towards Evaluation of Aspect Mining Workshop
Held in conjunction with ECOOP 2006 - July 4, 2006
Second International Workshop on Detection of Software Clones
[ URL ] Held in conjunction with WCRE 2003 - November 13, 2003
First International Workshop on Detection of Software Clones
[ URL ] Held before ICSM 2002 - October 2, 2002

Research Groups

Software Architecture Group (SWAG)
Software Composition Group (SCG)
Software Engineering Laboratory
Software Evolution Research Group (SWEVO)
Software Research Laboratory

This project is supported by NSF grant CPA-0702764.