Note

The aim of the minishell 42 project is to create a lightweight command-line interpreter that reproduces the essential features of bash. What sets this implementation apart is its robust parsing system, completely decoupled from execution, built on LALR(1) grammar principles, producing a clean and efficient Abstract Syntax Tree (AST) for command execution. This project demonstrates advanced parsing techniques and provides a basis for understanding how some modern shells interpret and execute commands.

✨ Features

🧩 Tokenizer: Flexible and scalable lexical analyzer that converts raw input into meaningful tokens

🔎 Grammar Parser: Table-driven, bottom-up parsing using LALR(1) (Look-Ahead LR) techniques

🔃 AST Generation: Efficient Abstract Syntax Tree construction thanks to grammar production rules

🔗 Efficient Builtins: Implementation of essential shell builtins (cd, echo, exit, etc.)

🧹 Resource Caching: Cached file descriptors and memory allocations with automatic cleanup on program exit

⚡ Hashmap-powered Environment: Fast, average O(1) access to environment variables

📏 42 School Compliant: Follows 42 School norm and coding standards

🚀 Getting Started

Prerequisites

Clang compiler

GNU Make

readline library

Installation

# Clone the repository
git clone --recurse-submodules https://github.com/MykleR/minishell.git

# Enter the directory and compile project
cd minishell; make

# Run the shell
./minishell

Important

Don't forget --recurse-submodules, otherwise the dependencies will not be cloned.

🔍 Technical Overview

What are LR Parsers?

An LR parser is a powerful tool used by interpreters and compilers to analyze the structure of code or commands. The "L" stands for reading the input Left-to-right, and the "R" for producing a Rightmost derivation in reverse. This type of parser works from the bottom up: it starts with the raw input (like shell commands), gradually groups symbols to form higher-level structures according to the grammar rules, and ultimately recognizes valid syntax.

The Grammar

This grammar formally describes the language's syntax.

Left side: The nonterminal being defined, which in our case corresponds to an AST node.

Right side: The sequence of symbols the production requires (these may be tokens or other nonterminals).

program -> list  
list -> list AND list  
list -> list OR list  
list -> list PIPE list  
list -> LBRACKET list RBRACKET  
list -> command  
redirection -> REDIR_IN arg  
redirection -> REDIR_OUT arg  
redirection -> REDIR_APP arg  
command -> arg  
command -> redirection  
command -> command arg  
command -> command redirection  
arg -> ARG
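A table-driven parser usually needs the grammar in a form it can index at reduce time: the symbol a rule produces and how many symbols it consumes. The sketch below shows one possible encoding of the 14 productions above. The names `t_symbol`, `t_production`, `g_productions` and the helper functions are hypothetical and are not taken from the minishell sources.

```c
#include <stddef.h>

/* Hypothetical identifiers for the left-hand sides of the grammar. */
typedef enum e_symbol
{
	SYM_PROGRAM,
	SYM_LIST,
	SYM_REDIRECTION,
	SYM_COMMAND,
	SYM_ARG
}	t_symbol;

/* One grammar rule: what it produces and how many symbols it pops. */
typedef struct s_production
{
	t_symbol	lhs;
	size_t		rhs_len;
	const char	*text;
}	t_production;

/* Same order as the grammar listing, so the array index is the rule number. */
static const t_production	g_productions[] = {
	{SYM_PROGRAM, 1, "program -> list"},
	{SYM_LIST, 3, "list -> list AND list"},
	{SYM_LIST, 3, "list -> list OR list"},
	{SYM_LIST, 3, "list -> list PIPE list"},
	{SYM_LIST, 3, "list -> LBRACKET list RBRACKET"},
	{SYM_LIST, 1, "list -> command"},
	{SYM_REDIRECTION, 2, "redirection -> REDIR_IN arg"},
	{SYM_REDIRECTION, 2, "redirection -> REDIR_OUT arg"},
	{SYM_REDIRECTION, 2, "redirection -> REDIR_APP arg"},
	{SYM_COMMAND, 1, "command -> arg"},
	{SYM_COMMAND, 1, "command -> redirection"},
	{SYM_COMMAND, 2, "command -> command arg"},
	{SYM_COMMAND, 2, "command -> command redirection"},
	{SYM_ARG, 1, "arg -> ARG"}
};

size_t	production_count(void)
{
	return (sizeof(g_productions) / sizeof(g_productions[0]));
}

/* Returns the rule with the given number, or NULL if it is out of range. */
const t_production	*production_get(size_t rule)
{
	if (rule >= production_count())
		return (NULL);
	return (&g_productions[rule]);
}
```

When the parser reduces by rule `n`, it pops `production_get(n)->rhs_len` entries from its stack and then looks up the goto state for `production_get(n)->lhs`.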

Action and Goto Tables

LR parsers rely on two main sets of instructions, called tables:

Action Table: This table tells the parser what to do next, depending on the current situation. The possible actions are:

Shift: Reads and places the next token from the input onto the stack, gathering more information before reducing to a grammar rule.

Reduce: Replaces gathered symbols on the stack with a single symbol, according to a grammar rule (e.g., a sequence of word tokens might be reduced to a single "command" symbol).

Accept: Successfully finish parsing. The grammar was fully respected.

Error: Indicate a problem in the input. The grammar was not respected.

Goto Table: After a reduction, this table tells the parser which state to move to next, based on the new symbol on top of the stack.

Note

The parser uses a stack to keep track of symbols and parser states. As it shifts tokens and reduces groups of symbols, the stack helps the parser remember where it is and what structures have been recognized so far. The action and goto tables are the central data structures of table-driven LR parsers, and parser generators compute them from the grammar.
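To make the interplay of the stack and the two tables concrete, here is a minimal sketch of an LR driver loop. It does not use the minishell grammar or its real tables. It uses a deliberately tiny grammar (`list -> list PIPE cmd | cmd`, `cmd -> cmd ARG | ARG`) whose tables were worked out by hand for this example, and every identifier in it is hypothetical.

```c
#include <stddef.h>

typedef enum e_token { TOK_ARG, TOK_PIPE, TOK_END, TOK_COUNT }	t_token;
typedef enum e_nonterm { NT_LIST, NT_CMD, NT_COUNT }	t_nonterm;
typedef enum e_kind { ACT_ERROR, ACT_SHIFT, ACT_REDUCE, ACT_ACCEPT }	t_kind;

typedef struct s_action
{
	t_kind	kind;
	int		arg;	/* target state for a shift, rule number for a reduce */
}	t_action;

#define SH(n) {ACT_SHIFT, (n)}
#define RE(n) {ACT_REDUCE, (n)}
#define ACC {ACT_ACCEPT, 0}
#define ERR {ACT_ERROR, 0}
#define STACK_MAX 64

/* Rules: 1: list -> list PIPE cmd, 2: list -> cmd,
 *        3: cmd -> cmd ARG,        4: cmd -> ARG   (rule 0 is the start rule) */
static const struct s_rule { t_nonterm lhs; int len; }	g_rules[] = {
	{NT_LIST, 1}, {NT_LIST, 3}, {NT_LIST, 1}, {NT_CMD, 2}, {NT_CMD, 1}
};

/* Action table, indexed by [state][token]: columns are ARG, PIPE, END. */
static const t_action	g_action[7][TOK_COUNT] = {
	{SH(3), ERR, ERR},
	{ERR, SH(4), ACC},
	{SH(5), RE(2), RE(2)},
	{RE(4), RE(4), RE(4)},
	{SH(3), ERR, ERR},
	{RE(3), RE(3), RE(3)},
	{SH(5), RE(1), RE(1)}
};

/* Goto table, indexed by [state][nonterminal]: columns are list, cmd. */
static const int	g_goto[7][NT_COUNT] = {
	{1, 2}, {-1, -1}, {-1, -1}, {-1, -1}, {-1, 6}, {-1, -1}, {-1, -1}
};

/* Returns 1 if `input` (valid tokens, terminated by TOK_END) is accepted. */
int	lr_parse(const t_token *input)
{
	int			stack[STACK_MAX];
	int			top;
	size_t		pos;
	t_action	act;

	top = 0;
	pos = 0;
	stack[0] = 0;
	while (1)
	{
		act = g_action[stack[top]][input[pos]];
		if (act.kind == ACT_SHIFT)
		{
			if (top + 1 >= STACK_MAX)
				return (0);
			stack[++top] = act.arg;		/* push the new state */
			pos++;						/* consume the token */
		}
		else if (act.kind == ACT_REDUCE)
		{
			top -= g_rules[act.arg].len;	/* pop one state per symbol */
			stack[top + 1] = g_goto[stack[top]][g_rules[act.arg].lhs];
			top++;
		}
		else
			return (act.kind == ACT_ACCEPT);
	}
}
```

With these tables, the token sequence for `ls -l | wc` (ARG ARG PIPE ARG END) is accepted, while a sequence that starts or ends with PIPE is rejected. Using the token enum value directly as the column index is what makes each step a plain array lookup.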

How It’s Used in minishell

In our minishell project, these action and goto tables are precomputed and built directly into the parsing engine. When the user enters a command, the parser looks up each token in these tables to decide whether to shift, reduce, accept, or signal an error. Because every decision is a table lookup, minishell can parse complex shell command syntax quickly and reliably.

Tip

You can generate tables, visualize parse trees, and try other grammars with this online tool: LALR(1) Parser Generator.

🔄 Processing Pipeline

flowchart LR
    A[Input] -->|Lexer|B(TOKENS)
    B -->|"special tokens" |G(HEREDOC)
    G -->|Parser |C{AST}
    C -->|Execution |C

Input Capture:

GNU Readline for input with history support

Tokenization:

Splits the input string into tokens such as redirections '>>', pipes '|', or words

Each token is classified based on its role in the shell language

You will find an exhaustive list of all token types in “headers/lexer.h”.

Token enum values are very important, as they are used as indices into the action table
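The classification step can be sketched as a small scanner that walks the input and emits one token kind per operator or word. This is only an illustration: the enum `t_tok` and the function `tokenize` are hypothetical (the real token list lives in “headers/lexer.h”), and quoting, variable expansion and the heredoc operator are left out on purpose.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds, named after the terminals of the grammar. */
typedef enum e_tok
{
	T_ARG,
	T_AND,
	T_OR,
	T_PIPE,
	T_LBRACKET,
	T_RBRACKET,
	T_REDIR_IN,
	T_REDIR_OUT,
	T_REDIR_APP,
	T_END
}	t_tok;

/* A word ends at whitespace, at an operator character, or at "&&". */
static int	is_word_end(const char *s)
{
	if (*s == '\0' || isspace((unsigned char)*s))
		return (1);
	if (strchr("|()<>", *s) != NULL)
		return (1);
	return (strncmp(s, "&&", 2) == 0);
}

/* Writes token kinds into `out` (capacity `max`), terminates the sequence
 * with T_END when max > 0, and returns the number of tokens before T_END. */
size_t	tokenize(const char *s, t_tok *out, size_t max)
{
	size_t	n;

	n = 0;
	while (*s && n + 1 < max)
	{
		if (isspace((unsigned char)*s))
			s++;
		else if (strncmp(s, "&&", 2) == 0 || strncmp(s, "||", 2) == 0)
		{
			out[n++] = (*s == '&') ? T_AND : T_OR;
			s += 2;
		}
		else if (strncmp(s, ">>", 2) == 0)
		{
			out[n++] = T_REDIR_APP;
			s += 2;
		}
		else if (strchr("|()<>", *s) != NULL)
		{
			if (*s == '|')
				out[n++] = T_PIPE;
			else if (*s == '(')
				out[n++] = T_LBRACKET;
			else if (*s == ')')
				out[n++] = T_RBRACKET;
			else
				out[n++] = (*s == '<') ? T_REDIR_IN : T_REDIR_OUT;
			s++;
		}
		else
		{
			out[n++] = T_ARG;			/* a plain word */
			while (!is_word_end(s))
				s++;
		}
	}
	if (max > 0)
		out[n] = T_END;
	return (n);
}
```

Two-character operators are tested before their one-character prefixes, so `>>` is never split into two `>` tokens and `||` is never read as two pipes.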

Heredoc Processing:

Handles heredocs and converts them to '<' redirections from a temporary file

Forks the process so that heredoc input can be read with readline
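The conversion of a heredoc into a plain '<' redirection can be sketched as follows: lines are copied into a temporary file until the delimiter is seen, and the resulting path is what the redirection later opens. To stay self-contained, this sketch reads from a `FILE *` with POSIX `getline` and does not fork or call readline, unlike the real implementation. The function name `heredoc_to_file` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Copies lines from `in` into a new temporary file until a line equal to
 * `delim` is read (or input ends). `path` must be a writable template ending
 * in "XXXXXX"; mkstemp fills it in. Returns 0 on success, -1 on error. */
int	heredoc_to_file(FILE *in, const char *delim, char *path)
{
	char	*line;
	size_t	cap;
	size_t	body;
	ssize_t	len;
	int		fd;

	line = NULL;
	cap = 0;
	fd = mkstemp(path);
	if (fd < 0)
		return (-1);
	while ((len = getline(&line, &cap, in)) > 0)
	{
		body = (size_t)len;
		if (line[body - 1] == '\n')
			body--;							/* compare without the newline */
		if (strlen(delim) == body && strncmp(line, delim, body) == 0)
			break ;
		if (write(fd, line, (size_t)len) != len)
		{
			free(line);
			close(fd);
			unlink(path);
			return (-1);
		}
	}
	free(line);
	return (close(fd));
}
```

After a successful call, the path can be treated exactly like the target of a '<' redirection, and the file should be unlinked once the command has run.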

AST Construction:

Builds an abstract syntax tree using the LALR(1) parser

As grammar rules are recognized, corresponding AST nodes are created

Nodes are connected to form a tree structure representing the command hierarchy

The tree captures command relationships and execution order
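One common way to create nodes as rules are recognized is to keep a stack of semantic values next to the state stack and run a small action on every reduce. The sketch below shows such an action for a binary rule like `list -> list AND list`. The types and functions (`t_ast`, `ast_new`, `reduce_binary`, `ast_free`) are hypothetical and only illustrate the idea.

```c
#include <stdlib.h>

typedef enum e_node_type { NODE_CMD, NODE_AND, NODE_OR, NODE_PIPE }	t_node_type;

typedef struct s_ast
{
	t_node_type		type;
	const char		*word;		/* command name, used by NODE_CMD only */
	struct s_ast	*left;
	struct s_ast	*right;
}	t_ast;

/* Allocates a node; returns NULL if malloc fails. */
t_ast	*ast_new(t_node_type type, const char *word, t_ast *left, t_ast *right)
{
	t_ast	*node;

	node = malloc(sizeof(*node));
	if (node == NULL)
		return (NULL);
	node->type = type;
	node->word = word;
	node->left = left;
	node->right = right;
	return (node);
}

/* Semantic action for a binary rule. The value stack holds
 * [..., left, operator, right]; the operator slot carries no node.
 * The three values are replaced by a single new parent node. */
int	reduce_binary(t_ast **values, int *top, t_node_type op)
{
	t_ast	*node;

	if (*top < 2)
		return (-1);
	node = ast_new(op, NULL, values[*top - 2], values[*top]);
	if (node == NULL)
		return (-1);
	*top -= 2;
	values[*top] = node;
	return (0);
}

/* Frees a whole subtree, children before their parent. */
void	ast_free(t_ast *node)
{
	if (node == NULL)
		return ;
	ast_free(node->left);
	ast_free(node->right);
	free(node);
}
```

Because reductions happen bottom-up, both children already exist when the parent is built, so the tree grows from the leaves toward the root.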

Tree Traversal:

Executes commands through a post-order traversal of the binary tree, so that command dependencies are respected

Nodes are processed according to their type (command, redirection, logical operator)

Execution results propagate up the tree to determine logical branch paths and exit code status
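How exit statuses travel up the tree can be sketched with a recursive evaluator. In this sketch, leaf commands are simulated by a stored status instead of being forked and executed, and a logical node evaluates its left child first and uses that status to decide whether the right child runs, as bash does for `&&` and `||`. A pipe node here simply runs both sides in order and returns the status of the right side. All names are hypothetical.

```c
#include <stddef.h>

typedef enum e_kind { K_CMD, K_AND, K_OR, K_PIPE }	t_kind;

typedef struct s_node
{
	t_kind			kind;
	int				status;		/* K_CMD only: simulated exit status */
	int				ran;		/* K_CMD only: set once the leaf has run */
	struct s_node	*left;
	struct s_node	*right;
}	t_node;

/* Evaluates the tree and returns the exit status of the last command run. */
int	ast_exec(t_node *node)
{
	int	status;

	if (node == NULL)
		return (0);
	if (node->kind == K_CMD)
	{
		node->ran = 1;				/* a real shell would fork and exec here */
		return (node->status);
	}
	status = ast_exec(node->left);
	if (node->kind == K_AND && status != 0)
		return (status);			/* left failed: skip the right side */
	if (node->kind == K_OR && status == 0)
		return (status);			/* left succeeded: skip the right side */
	return (ast_exec(node->right));
}
```

For a tree equivalent to `false && echo skipped || true`, the AND node returns the failing status without visiting its right child, and the OR node then runs its right child, so the overall status is 0.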
