Using regular expressions for mining data in large software repositories
| Main Author: | |
|---|---|
| Format: | Conference or Workshop Item |
| Language: | English |
| Published: | IEEE 2014 |
| Subjects: | |
| Online Access: | http://irep.iium.edu.my/42896/ http://irep.iium.edu.my/42896/6/42896-Using%20Regular%20Expressions%20for%20Mining%20Data%20in%20Large.pdf http://irep.iium.edu.my/42896/7/42896-Using%20Regular%20Expressions%20for%20Mining%20Data%20in%20Large.pdf |
| Summary: | Applying data mining techniques to software repositories involves extracting both basic and value-added information from them. Regular expressions (regex) provide a mechanism for selecting specific strings from a set of character strings. In this paper, we discuss how regular expressions are used to build a data mining tool known as OSSGrab. We developed the tool in Python in combination with regex, and as a result the time spent on data collection is reduced significantly. |
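
The summary describes selecting specific strings from repository text with regular expressions in Python. The record does not show OSSGrab's actual patterns or input format, so the following is only a minimal sketch of the general idea: the log format in `SAMPLE_LOG`, the pattern `COMMIT_RE`, and the helper `extract_commits` are all hypothetical names and data chosen for illustration.

```python
import re

# Hypothetical commit-log text; this format is an assumption for illustration,
# not the input that OSSGrab actually processes.
SAMPLE_LOG = """\
commit 3f2a9c1 Author: Jane Doe <jane@example.org> Date: 2014-03-02
commit 9b7e004 Author: Ali Hassan <ali@example.org> Date: 2014-03-05
not a commit line
"""

# Named groups pick out the fields of interest from each matching line.
# re.MULTILINE makes ^ and $ anchor at every line, not just the whole string.
COMMIT_RE = re.compile(
    r"^commit (?P<sha>[0-9a-f]{7}) "
    r"Author: (?P<name>[^<]+) <(?P<email>[^>]+)> "
    r"Date: (?P<date>\d{4}-\d{2}-\d{2})$",
    re.MULTILINE,
)


def extract_commits(text):
    """Return one dict per line that matches the (hypothetical) commit format."""
    return [match.groupdict() for match in COMMIT_RE.finditer(text)]


records = extract_commits(SAMPLE_LOG)
for record in records:
    print(record["sha"], record["name"], record["date"])
```

Lines that do not fit the pattern are skipped, which is how a regex-based collector can filter the relevant strings out of a larger body of text without manual inspection.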