Adding Quadratic Structure to SAC
sac.py contains the baseline SAC implementation.
sac_lqr.py is the version with quadratic structure added.
curl -LsSf https://astral.sh/uv/install.sh | sh
Important: Restart your terminal after installation.
git clone https://github.com/jacoblgit/actorcritics.git
cd actorcritics
# Core dependencies
uv sync
# Development tools (ruff linter/formatter)
uv sync --extra dev
uv run python sac_lqr.py --help
uv run python sac_lqr.py --env-id InvertedPendulum-v4
Use max_eigenvalue (which constrains the maximum eigenvalue to stabilize runs) with a coefficient of 0.001 for the combined run:
uv run python sac_lqr.py (... other flags) --loss-type max_eigenvalue --loss-coef 0.001
For more details, check out:
uv run tensorboard --logdir runs/
# Check code
uv run ruff check
# Format code
uv run ruff format
uv add <package-name>
Environment setup based on Modern Python Practices
The baseline SAC implementation is CleanRL SAC
Info on SAC: Soft Actor-Critic — Spinning Up documentation
Info on LQR: Ch. 8 - Linear Quadratic Regulators
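
This README does not spell out how the max_eigenvalue loss is computed. As a rough illustration only, the sketch below assumes it penalizes the largest eigenvalue of a symmetric matrix taken from the quadratic model, scaled by the loss coefficient. The helper name max_eigenvalue_penalty and the example matrix are hypothetical and are not taken from sac_lqr.py; NumPy is used here for brevity.

```python
import numpy as np


def max_eigenvalue_penalty(matrix, coef=0.001):
    """Hypothetical helper: coef times the largest eigenvalue of the symmetrized matrix."""
    # Symmetrize first, since eigvalsh assumes a symmetric (Hermitian) input.
    sym = 0.5 * (matrix + matrix.T)
    # eigvalsh returns eigenvalues in ascending order, so the last one is the largest.
    largest = np.linalg.eigvalsh(sym)[-1]
    return coef * largest


# Example matrix (an assumption for illustration): eigenvalues are -1 and 2.
P = np.array([[2.0, 0.0], [0.0, -1.0]])
penalty = max_eigenvalue_penalty(P, coef=0.001)
```

Adding a term like this to the training loss pushes the largest eigenvalue down; whether the repository applies it to the critic, the actor, or another matrix is not stated here, so check sac_lqr.py for the actual definition.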