This C library provides efficient implementations of linear regression algorithms, including support for stochastic gradient descent (SGD) and data normalization techniques. It is designed for easy integration into your C projects, enabling you to perform regression analysis on various datasets.

Hemanthsp999/Simple-Linear-Regression-C-Library

Simple Linear Regression Library

This project is a lightweight C library designed for simple linear regression. It provides an efficient and flexible way to perform data analysis, enabling users to model relationships between two variables with ease. The library includes functions for data normalization, dataset splitting, and model training, ensuring optimal performance even on systems with limited resources.

Key Features:

  • Efficiency: Optimized for high performance, delivering accurate results with minimal computational overhead.
  • Flexibility: Easily integrates into larger C projects, supporting various dataset formats and workflows.

Perfect for developers seeking a foundational yet robust tool for regression tasks in C.


Table of Contents

  1. Project Structure
  2. How to Use?
  3. How to Debug or Trace Memory Allocation Errors?
  4. File Descriptions
  5. License

Project Structure

Project Root Directory
|-- build
|   `-- test
|-- EDA
|   |-- DataAnalysis.c
|   `-- DataAnalysis.h
|-- Regression
|   |-- Linear.c
|   `-- Linear.h
|-- Test
|   `-- test.c
|-- compile_commands.json
|-- License
|-- makefile
|-- README.md
|-- winequality.names
|-- winequality-red.csv
`-- winequality-white.csv

How to Use?

1. Include the necessary header files

The library's headers are EDA/DataAnalysis.h and Regression/Linear.h (see Project Structure).

The example below shows how to use the library.

/* Include paths are assumed to be relative to the project root */
#include "EDA/DataAnalysis.h"
#include "Regression/Linear.h"

const char *filename = "your_file_name";

int main(void) {

    /* Read the dataset */
    getFile *read_data = Read_Dataset(filename, "specify the Independent_var col", "specify the Dependent_var col");

    /* Apply normalization */
    NormVar *normalize = Normalize(read_data->x, read_data->y, read_data->num_rows);

    float train_ratio = 0.8f, lr = 0.01f, lambda1 = 0.05f, lambda2 = 0.05f;
    int epochs = 1000; /* example value (assumption); tune for your dataset */

    /* Split the dataset; the row count is assumed to be the dataset size */
    SplitData *split_data = Split_Dataset(normalize->X, normalize->Y, read_data->num_rows, train_ratio);

    /* Fit the model */
    Beta *model = Fit_Model(split_data->X_Train, split_data->Y_Train, split_data->train_size, split_data->train_size, epochs, lr, lambda1, lambda2);

    /* Make predictions */
    float *prediction = Prediction_Model(split_data->X_Test, split_data->test_size, *model);

    ...

}
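
For readers curious about what a fit step of this kind involves, here is a minimal, self-contained sketch of batch gradient descent for a single-feature model y = slope * x + intercept. It is not the library's Fit_Model implementation. The names LineFit and fit_line are hypothetical, and reading lambda1 and lambda2 as L1 and L2 regularization strengths is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type; not part of the library. */
typedef struct {
    float slope;
    float intercept;
} LineFit;

/* Sign of v (-1, 0, or 1), used as the subgradient of the L1 penalty. */
static float sign_of(float v)
{
    return (float)((v > 0.0f) - (v < 0.0f));
}

/*
 * Batch gradient descent on the mean squared error.
 * Assumption: lambda1 and lambda2 are L1 and L2 penalty strengths,
 * applied to the slope only (the intercept is left unpenalized).
 */
LineFit fit_line(const float *x, const float *y, size_t n,
                 int epochs, float lr, float lambda1, float lambda2)
{
    assert(n > 0 && epochs >= 0);
    LineFit m = {0.0f, 0.0f};

    for (int e = 0; e < epochs; e++) {
        float grad_slope = 0.0f;
        float grad_intercept = 0.0f;

        /* Accumulate the error gradient over the whole training set. */
        for (size_t i = 0; i < n; i++) {
            float err = (m.slope * x[i] + m.intercept) - y[i];
            grad_slope += err * x[i];
            grad_intercept += err;
        }

        /* Average the gradients and add the penalty terms. */
        grad_slope = 2.0f * grad_slope / (float)n
                   + lambda1 * sign_of(m.slope)
                   + 2.0f * lambda2 * m.slope;
        grad_intercept = 2.0f * grad_intercept / (float)n;

        /* Step against the gradient. */
        m.slope -= lr * grad_slope;
        m.intercept -= lr * grad_intercept;
    }
    return m;
}
```

With both penalties set to 0, calling fit_line(x, y, n, 5000, 0.1f, 0.0f, 0.0f) on points that lie on y = 2x + 1 with x in [0, 1] should converge to a slope near 2 and an intercept near 1. Non-zero penalties shrink the slope toward 0.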

The values returned by Prediction_Model are normalized, i.e. on a 0-to-1 scale. To denormalize them, see the example below:

{
    .....
    
    float *denormalize = Denormalize(prediction, normalize->y_min, normalize->y_max, split_data->test_size);
    
    .....
}
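
To make the mapping concrete, the sketch below shows min-max scaling and its inverse. It assumes the library uses min-max normalization, which the y_min and y_max arguments suggest but the README does not confirm. The helper names minmax_scale, minmax_unscale, and minmax_unscale_array are hypothetical and not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers; not the library's Normalize/Denormalize. */

/* Scale v from the range [min, max] into [0, 1]. */
float minmax_scale(float v, float min, float max)
{
    assert(max > min); /* a zero-width range cannot be scaled */
    return (v - min) / (max - min);
}

/* Map a normalized value back onto the original [min, max] scale. */
float minmax_unscale(float v_norm, float min, float max)
{
    return v_norm * (max - min) + min;
}

/* Denormalize an array of n values in place. */
void minmax_unscale_array(float *values, size_t n, float min, float max)
{
    for (size_t i = 0; i < n; i++)
        values[i] = minmax_unscale(values[i], min, max);
}
```

Under this assumption, a normalized prediction of 0.4 with y_min = 3 and y_max = 8 maps back to 0.4 * (8 - 3) + 3 = 5.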

To evaluate the model's accuracy, use the following metric functions:

{
    ....
    
    metricResult rmse = Root_Mean_Square_Error(split_data->Y_Test, prediction, split_data->test_size);
    metricResult r_square = R_Square(split_data->Y_Test, prediction, split_data->test_size);
    metricResult mse = Mean_Square_Error(split_data->Y_Test, prediction, split_data->test_size);
    metricResult mae = Mean_Absolute_Error(split_data->Y_Test, prediction, split_data->test_size);

    ....

}

How to Debug or Trace Memory Allocation Errors?

Use GDB for debugging and Valgrind to check for memory issues.

Debugging with GDB:

gdb ./build/test

Refer to the official GDB documentation for more details.

Memory Leak Detection with Valgrind:

valgrind --leak-check=full --track-origins=yes -s ./build/test

Refer to the official Valgrind documentation for further information.


File Descriptions

  • build/test: The compiled binary file generated after running the make command.
  • EDA/DataAnalysis.c and EDA/DataAnalysis.h: Source and header files for exploratory data analysis (EDA) utilities.
  • Regression/Linear.c and Regression/Linear.h: Source and header files for simple linear regression implementation.
  • Test/test.c: Test file to validate and demonstrate library functionality.
  • compile_commands.json: Compilation database used by tooling such as clangd or VS Code.
  • winequality-red.csv and winequality-white.csv: Example datasets for testing and demonstrating the library.

License

The project is licensed under the MIT License. Check the License file in the root directory for more details.

