This repository contains the implementation and data for the course project "A Deep Dive into bug reports with Stack Traces in Production-like scenarios".
The scripts are organized as follows:
- getBugReportsList_Jira - Extracts all the Jira tickets from a given project that meet this project's criteria: (1) the ticket is closed, (2) it has a bug issue type assigned to it, (3) it contains logs (log snippets and/or stack traces), and (4) it has a traceable bugfix commit
- getBugReportsList_GithubIssues - Does the same, but for GitHub Issues instead of Jira
- searchCommitOnTEvosRepo - Given a JSON file with the bug reports and their corresponding buggy and bugfix commits, it searches for the commits in the T-Evos database and outputs the bug reports for which both commits were found.
- extractCodeCoverageData - Given a folder with T-Evos coverage data for a given project, it calculates details such as the covered lines and the coverage percentage for each file
- analyseCoverage - Based on the coverage data, it performs a file-level coverage analysis
- analyseCoverage-2.0. - Based on the coverage data, it performs a coverage analysis at the method level and overall
- modifiedOchiai - Implements Approach 1 based on Ochiai
- modifiedOchiai2 - Implements Approach 2 based on Ochiai
- analyseOchiaiOutputs - Based on the outputs from the previous scripts (the suspiciousness score of each file) and the stack traces file, it generates a ranking and calculates the metrics (Precision@N, Recall@N, F1@N, MAP and MRR) for each of them.
- getLogBugInformationDefects4J - Uses a regex to filter the bug reports from Defects4J that contain log snippets or stack traces. It then gets the complete information for each of these bugs and selects only the ones in which the failing tests were actually introduced in the bugfix commit.
- analyseGzoltarFiles - Taking the coverage files generated by GZoltar as input, it removes the injected failing tests and extracts the coverage in terms of covered lines per file.
- calculateCoveragePercentages - Using the output of the previous script and the start and end lines obtained from the JavaAST implementation as input, it calculates the following metrics: Average_coverage_buggy_files, Average_coverage_stack_trace_files, Average_all_files_coverage, Average_buggy_methods_coverage, Average_st_methods_coverage, Pos_first_buggy_method_in_stack_trace
- extractInformation_merged_data - It goes through the bugfix commit diff to get the line numbers for all the added and deleted lines.
- extractMethodsAndFilesFromTheStackTraces - Applies a regex to extract the filename, the method name and the line number of each stack trace entry
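As an illustration of the stack trace extraction step described in the last item, here is a minimal Python sketch. It is not the script's actual code: the pattern `FRAME_RE` and the function `parse_stack_trace` are hypothetical names, and the regex assumes the common Java frame format `at package.Class.method(File.java:123)`. Frames without a file and line number (for example `Native Method`) are skipped.

```python
import re

# Assumed pattern for a standard Java stack frame such as
# "at org.example.Foo.bar(Foo.java:42)". The regex used by the real
# script may differ; this one is only an illustration.
FRAME_RE = re.compile(
    r"^\s*at\s+(?P<qualified>[\w.$<>]+)\.(?P<method>[\w$<>]+)"
    r"\((?P<file>[\w$]+\.java):(?P<line>\d+)\)",
    re.MULTILINE,
)


def parse_stack_trace(text):
    """Return one dict per frame that carries a file name and a line number."""
    frames = []
    for match in FRAME_RE.finditer(text):
        frames.append({
            "class": match.group("qualified"),
            "method": match.group("method"),
            "file": match.group("file"),
            "line": int(match.group("line")),
        })
    return frames
```

Calling `parse_stack_trace` on the text of a bug report returns the frames in the order they appear, so the first entry corresponds to the top of the stack trace.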
The data is structured as follows:
- /commits - Contains the outputs from the scripts getBugReportsList_Jira and getBugReportsList_GithubIssues for all the repos in T-Evos
- /coverage_info - Contains the outputs from the script analyseCoverage
- /coverageMining - Contains the outputs from the script analyseCoverage-2.0.
- /ochiaiScores - Contains the outputs from the script modifiedOchiai
- /ochiaiScores2 - Contains the outputs from the script modifiedOchiai2
- /Rankings - Contains the ranking information for Approach 1, Approach 2, and the stack trace entries
- /defects4j - Contains the data extracted from the Defects4J bugs
- /data - Contains the merged data of the T-Evos and Defects4J bugs; the main file is merged_data_production_bug_reports.json
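For readers unfamiliar with the ranking metrics stored under /Rankings, the following Python sketch shows the textbook definitions for a single bug report. It is an assumption about how the metrics are defined, not a copy of analyseOchiaiOutputs: the function names are hypothetical, Precision@N divides by N, and MAP and MRR would be the means of `average_precision` and `reciprocal_rank` over all bug reports.

```python
def precision_recall_at_n(ranked, relevant, n):
    """Precision@N and Recall@N for one ranked list of files."""
    top = ranked[:n]
    hits = sum(1 for item in top if item in relevant)
    precision = hits / n if n else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked, relevant):
    """Mean of the precision values at each position holding a relevant item."""
    hits = 0
    total = 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0
```

Here `ranked` is a list of file names ordered from most to least suspicious, and `relevant` is the set of files actually changed by the bugfix commit.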
The JavaAST folder contains an AST implementation that takes the bug information as input and extracts the buggy methods, added methods, new tests and modified tests. It is also used to extract the begin and end line numbers of the stack trace methods, which are used to calculate the coverage.
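
The way begin and end line numbers can be combined with covered lines to obtain a method-level coverage figure can be sketched as follows. This is a simplified Python illustration, not the project's implementation: `method_coverage` is a hypothetical name, and the sketch assumes that every line between the begin and end line counts toward the denominator (including blank lines and comments), which the real scripts may handle differently.

```python
def method_coverage(covered_lines, begin_line, end_line):
    """Percentage of lines in [begin_line, end_line] found in covered_lines."""
    if end_line < begin_line:
        raise ValueError("end_line must not be smaller than begin_line")
    span = range(begin_line, end_line + 1)
    covered = sum(1 for line in span if line in covered_lines)
    return 100.0 * covered / len(span)
```

For example, a method spanning lines 10 to 13 with lines 10, 11 and 12 covered yields `method_coverage({10, 11, 12}, 10, 13)`, which is 75.0.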