Projects
Project
A Text Analysis Paradigm for Enhancing Software Developers Productivity
Number
LEQSF(2015-18)-RD-A-07
Program
Louisiana Board of Regents RCS program
Summary
Contemporary software engineering tools employ Natural Language Processing (NLP) techniques and Information Retrieval (IR) methods for automated support. Such methods exploit the semantic and syntactic knowledge embedded in the textual content of source code to discover important information about the system. This information can then be used in several essential software engineering activities such as traceability, refactoring, and reverse engineering. However, as software evolves, new and inconsistent terminology gradually finds its way into the project, causing the textual content of source code to become less natural over time. Furthermore, source code is highly repetitive, often homogeneous, and suffers from data sparsity and vocabulary mismatch problems. Therefore, applying NLP and IR methods to source code without adjustment can be detrimental. Motivated by these observations, in this proposal we suggest a novel text-processing paradigm adjusted for software. Our main objectives are (1) to introduce an effective, scalable, and computationally efficient paradigm for processing and analyzing the textual content of source code, and (2) to integrate the proposed paradigm into working prototypes that provide support for several essential software engineering activities. To achieve our objectives, we will conduct a series of analytical experiments, using industrial software systems, to establish the main constructs of our paradigm. Furthermore, sets of human studies will be conducted to assess the usability and effectiveness of our proposed tools. The broader significance of this research arises from the economic impact of the design and development of software engineering tools that enhance software developers’ productivity and ability to produce high-quality software.
Project
EAGER: Statistical Modeling of Linguistic Change in Open Source Software
Number
CCF:1821525
Program
National Science Foundation: NSF
Summary
The project explores a theory of open source software (OSS) evolution based on statistical natural language processing techniques. Based on the emerging recognition that software code is, in many ways, as "natural" as natural language (e.g., English), there is a trend to apply statistical models to software development tasks such as code analysis, comprehension, and programmer support. This grant extends the "naturalness of code" theory by studying how the code lexicon evolves in open source software as different developers work on a software project and features are added, modified, and deleted. The goal is to learn the extent to which the evolution of a developer's lexicon follows the laws of natural language evolution. To create the needed demonstration, large datasets of code lexicons are being collected from a large number of OSS projects and their revisions (on GitHub and SourceForge). The main constructs of the frequency model of natural language evolution will be applied to track and identify the main patterns of language change (e.g., birth, propagation, and death of terms in the lexicon) throughout the life cycle of OSS projects. Part of the challenge is to better understand how events that instigate code evolution, such as maintenance activities and team formation, are fundamentally different from the events that instigate change in natural language, such as war and migration. The research should lead to new ways to predict software project outcomes and to improve software productivity and quality. The project will make available the data, tools, and algorithms it produces, which will support future work to understand the dynamics of code evolution in open source software ecosystems.
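To make the idea of tracking term birth and death concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the project's actual tooling: the function names (`split_identifier`, `lexicon`, `term_lifecycles`), the identifier-splitting rule, and the sample data are all hypothetical. It assumes each revision has already been reduced to a list of identifiers, in chronological order.

```python
import re
from collections import Counter


def split_identifier(identifier):
    """Split a snake_case or camelCase identifier into lowercase terms.

    Assumption: this simple rule is only illustrative; real lexicon
    extraction would likely need language-aware tokenization.
    """
    terms = []
    for part in re.split(r"[_\W]+", identifier):
        # Acronym runs (HTTP), capitalized or lowercase words, digit runs.
        terms.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [t.lower() for t in terms]


def lexicon(identifiers):
    """Term-frequency lexicon for one revision."""
    return Counter(t for ident in identifiers for t in split_identifier(ident))


def term_lifecycles(revisions):
    """Map each term to (birth, death) revision indices.

    birth: index of the first revision containing the term.
    death: index of the first revision after the term's last appearance,
           or None if the term is still present in the final revision.
    A term that vanishes and later reappears counts as alive in between.
    """
    first_seen, last_seen = {}, {}
    for index, identifiers in enumerate(revisions):
        for term in lexicon(identifiers):
            first_seen.setdefault(term, index)
            last_seen[term] = index
    final = len(revisions) - 1
    return {
        term: (first_seen[term],
               None if last_seen[term] == final else last_seen[term] + 1)
        for term in first_seen
    }


# Hypothetical data: identifiers found in three successive revisions.
example_revisions = [
    ["parseInput", "user_name"],
    ["parseInput", "userName", "fetchData"],
    ["fetchData", "user_id"],
]

if __name__ == "__main__":
    for term, (birth, death) in sorted(term_lifecycles(example_revisions).items()):
        print(term, birth, death)
```

Propagation could be examined in a similar way by comparing the per-revision `Counter` values returned by `lexicon`, for example by following how a term's frequency changes from one revision to the next.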
Project
FRG-E: Building an Infrastructure for Open Source Software Development in Louisiana
Number
PG 990011
Program
LSU Emerging Research Grants
Summary
This project investigates the OSS movement and its potential contribution to the economic development of the State. In particular, the innovation of this project lies in building an infrastructure for establishing and sustaining an active relationship between academia and the local software industry through the OSS link. This infrastructure will include training programs, course modules, and software tools designed to prepare the new generation of software developers in Louisiana.
Project
EDA: 1. Software Engineering in the Mobile Era
Number
------
Program
Economic Development Enhancement
Summary
Under this project, we seek to develop novel systematic solutions to fundamental requirements engineering problems in the context of mobile app development. These solutions incorporate methods of requirements modeling, data mining, and domain engineering to identify pervasive user concerns in the mobile app market. A user concern can be defined as any direct or indirect, technical or non-technical, behavior of the app that might impact its users’ experience or their overall wellbeing. These concerns are extracted from different channels of mobile user feedback, such as app store reviews and social media, and modeled using formal feature diagrams and user-goal interdependency graphs. These models are intended to act as blueprints for sustainable app design, enabling app developers to identify and prioritize important user concerns in the mobile app market and develop adaptive release engineering strategies that can address these concerns in an effective and timely manner.