# PyCaret

[📚 Source](https://pycaret.gitbook.io/docs/)

PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows. It is an end-to-end machine learning and model management tool that substantially shortens the experiment cycle and makes you more productive.

Compared with other open-source machine learning libraries, PyCaret is a low-code alternative that can replace hundreds of lines of code with only a few, which makes experiments much faster to set up and run. PyCaret is essentially a Python wrapper around several machine learning libraries and frameworks such as scikit-learn, XGBoost, LightGBM, CatBoost, spaCy, Optuna, Hyperopt, Ray, and a few more.

The link above points to detailed documentation and is an excellent resource for learning PyCaret. Some common functions of PyCaret are discussed below.

## Setup

The `setup` function initializes the training environment and creates the transformation pipeline. It must be called before any other function. It takes two mandatory parameters, `data` and `target`; all other parameters are optional. Some of the things that can be configured in `setup` are as follows:

* **Variables:** Define which variables are numerical, which are categorical, and which should be ignored
* **Missing Value Imputation**
* **One Hot Encoding**
* **Target Imbalance**
* **Outlier Treatment**
* **Scaling and Transformation**
* **Feature Engineering and Selection**
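
The options listed above map onto keyword arguments of `setup`. The following is a minimal sketch rather than a definitive recipe: the column names (`churn`, `age`, `balance`, `country`, `customer_id`) and the helper names `build_setup_options` and `init_experiment` are hypothetical, and the exact set of accepted keywords can differ between PyCaret releases, so check them against the version you have installed.

```python
def build_setup_options(target, numeric=None, categorical=None, ignore=None, seed=42):
    """Collect keyword arguments for PyCaret's setup() in one place.

    Hypothetical helper: it only builds a dict and does not call PyCaret.
    """
    return {
        "target": target,
        "numeric_features": numeric,          # Variables: treat as numerical
        "categorical_features": categorical,  # Variables: treat as categorical
        "ignore_features": ignore,            # Variables: drop before modelling
        "fix_imbalance": True,                # Target Imbalance
        "remove_outliers": True,              # Outlier Treatment
        "normalize": True,                    # Scaling
        "transformation": True,               # Transformation
        "feature_selection": True,            # Feature Selection
        "session_id": seed,                   # seed for reproducibility
        # Missing value imputation and one-hot encoding are left at
        # setup's defaults here.
    }


def init_experiment(data, options):
    """Call setup() on a pandas DataFrame with the collected options."""
    # Imported lazily so that building the options does not require PyCaret.
    from pycaret.classification import setup

    return setup(data=data, **options)


options = build_setup_options(
    target="churn",
    numeric=["age", "balance"],
    categorical=["country"],
    ignore=["customer_id"],
)
```

With a DataFrame `df` that contains these columns, `init_experiment(df, options)` would start the experiment. Keeping the options in a plain dictionary makes it easy to log or reuse the same configuration across runs.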

## Train & Optimize

`compare_models` trains all estimators available in the model library and evaluates their performance using cross-validation. After model selection you can tune the model by optimizing the probability threshold, or combine models through stacking and ensembling. PyCaret also provides an easy way to analyze your model with the `evaluate_model` command, a single command that shows hyperparameters, plots, feature importance, and more.

<figure><img src="/files/SbyCNNFDHYteqvpGs7Io" alt=""><figcaption></figcaption></figure>
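
The compare, tune, and ensemble workflow can be sketched as a small function. This is an illustration under assumptions: `train_and_optimize` and its `task` parameter are hypothetical names, and `task` is assumed to be a PyCaret task module (for example `pycaret.classification`) on which `setup` has already been called.

```python
def train_and_optimize(task, n_candidates=3):
    """Compare, tune, and ensemble models using a PyCaret task module.

    `task` is assumed to expose compare_models, tune_model, blend_models,
    and stack_models, as pycaret.classification and pycaret.regression do.
    """
    # Cross-validate every estimator in the library and keep the best few.
    top = task.compare_models(n_select=n_candidates)
    if not isinstance(top, list):  # a single model comes back when n_select=1
        top = [top]

    # Tune the hyperparameters of each shortlisted model.
    tuned = [task.tune_model(model) for model in top]

    # Combine the tuned models in two different ways.
    blended = task.blend_models(tuned)
    stacked = task.stack_models(tuned)

    return {"tuned": tuned, "blended": blended, "stacked": stacked}
```

A typical call would be `import pycaret.classification as clf` followed by `result = train_and_optimize(clf)`. In a notebook, `clf.evaluate_model(result["blended"])` then opens the interactive analysis view. Passing the module in as an argument keeps the same function usable for regression.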

The `interpret_model` command analyzes the predictions generated by a trained model. Most plots in this function are based on SHAP (SHapley Additive exPlanations).
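
As a rough sketch of how the SHAP-based plots might be requested, the wrapper below forwards a plot name to `interpret_model`. The names `explain_model` and `SHAP_PLOTS` are hypothetical, `task` is assumed to be a PyCaret task module such as `pycaret.classification`, and the set of supported plot names and model types depends on the installed PyCaret and SHAP versions.

```python
# Plot names assumed here; other releases may offer additional ones.
SHAP_PLOTS = ("summary", "correlation", "reason")


def explain_model(task, model, plot="summary", observation=None):
    """Request one SHAP-based plot for a trained model.

    `observation` is only forwarded for the "reason" plot, which explains
    a single row rather than the whole dataset.
    """
    if plot not in SHAP_PLOTS:
        raise ValueError(f"unsupported plot {plot!r}; choose from {SHAP_PLOTS}")

    kwargs = {"plot": plot}
    if plot == "reason" and observation is not None:
        kwargs["observation"] = observation

    return task.interpret_model(model, **kwargs)
```

For example, `explain_model(clf, model, plot="reason", observation=0)` would ask for an explanation of the first row, assuming `clf` is `pycaret.classification` and `model` is a tree-based estimator trained in the current experiment.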

## Predict & Deploy

The `predict_model` function generates predictions using a trained model. You can also get the raw probability scores for classification, set the probability threshold, and monitor data drift. After finalizing the model, you can easily deploy it to GCP, AWS, or Azure.

**As of now, PyCaret supports Regression, Classification, Clustering, Anomaly Detection, NLP, Association Rule Mining, and Time Series.**
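
A minimal sketch of the finalize, predict, and persist steps for a classifier follows. The helper name `predict_and_persist`, the default threshold of `0.5`, and the file name `final_model` are assumptions made for illustration; `task` is again assumed to be a PyCaret task module such as `pycaret.classification`. Cloud deployment is not shown because it needs platform credentials.

```python
def predict_and_persist(task, model, new_data, threshold=0.5, name="final_model"):
    """Finalize a classifier, score unseen data, and save the pipeline."""
    # Refit the chosen model on the full dataset, hold-out set included.
    final = task.finalize_model(model)

    # Score new rows; raw_score=True requests per-class probabilities,
    # and probability_threshold sets the cut-off for the positive class.
    scored = task.predict_model(
        final,
        data=new_data,
        probability_threshold=threshold,
        raw_score=True,
    )

    # Persist the fitted pipeline to disk for later reuse.
    task.save_model(final, name)
    return scored
```

The returned object is the input data with prediction columns appended; the exact column names vary between PyCaret versions, so inspect `scored.columns` rather than relying on fixed names.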

