Updated DevGuide (markdown)

Vidar Holen
2022-07-20 10:43:36 -07:00
parent 61841485ee
commit 0149832c28

@@ -89,41 +89,36 @@ ShellCheck has multiple output formatters. These take parsing results and output
Let's say that we have a pet peeve: people who use `tmp` as a temporary filename. We want to warn about statements like `sort file > tmp && mv tmp file`, and suggest using `mktemp` instead.
We can start by looking at the AST for `sort file > tmp`. In a ShellCheck source directory, run `cabal build` to generate necessary files, then run `cabal repl` and `:load src/ShellCheck/Parser.hs`:
To get started, clone the ShellCheck repository and run `cabal repl` followed by `:load ShellCheck.Debug`. This is a development module that offers access to a number of convenient functions, helpfully listed in [Debug.hs](https://github.com/koalaman/shellcheck/blob/master/src/ShellCheck/Debug.hs):
```
*ShellCheck.AST> :load src/ShellCheck/Parser.hs
[1 of 7] Compiling Paths_ShellCheck ( /home/james/repos/shellcheck/dist-newstyle/build/x86_64-linux/ghc-8.8.4/ShellCheck-0.7.2/build/autogen/Paths_ShellCheck.hs, interpreted )
[2 of 7] Compiling ShellCheck.Regex ( src/ShellCheck/Regex.hs, interpreted )
[3 of 7] Compiling ShellCheck.AST ( src/ShellCheck/AST.hs, interpreted )
[4 of 7] Compiling ShellCheck.Interface ( src/ShellCheck/Interface.hs, interpreted )
[5 of 7] Compiling ShellCheck.Data ( src/ShellCheck/Data.hs, interpreted )
[6 of 7] Compiling ShellCheck.ASTLib ( src/ShellCheck/ASTLib.hs, interpreted )
[7 of 7] Compiling ShellCheck.Parser ( src/ShellCheck/Parser.hs, interpreted )
Ok, 7 modules loaded.
*ShellCheck.Parser>
*ShellCheck.AST> :load ShellCheck.Debug
[...]
[16 of 19] Compiling ShellCheck.Analytics ( src/ShellCheck/Analytics.hs, interpreted )
[17 of 19] Compiling ShellCheck.Analyzer ( src/ShellCheck/Analyzer.hs, interpreted )
[18 of 19] Compiling ShellCheck.Checker ( src/ShellCheck/Checker.hs, interpreted )
[19 of 19] Compiling ShellCheck.Debug ( src/ShellCheck/Debug.hs, interpreted )
Ok, 19 modules loaded.
*ShellCheck.Debug>
```
This has given us a REPL where we can call parsing functions. There's a convenient `debugParseScript` function that will take a string and give the complete parser result (minus noisy token positions):
Now we can look at the AST for our command:
```haskell
*ShellCheck.Parser> debugParseScript "sort file > tmp"
ParseResult {prComments = [], prTokenPositions = fromList [(Id 0,Position {posFile = "removed for clarity", posLine = -1, posColumn = -1})], prRoot = Just (T_Annotation (Id 1) [] (T_Script (Id 0) "" [T_Pipeline (Id 3) [] [T_Redirecting (Id 4) [T_FdRedirect (Id 10) "" (T_IoFile (Id 11) (T_Greater (Id 12)) (T_NormalWord (Id 13) [T_Literal (Id 14) "tmp"]))] (T_SimpleCommand (Id 5) [] [T_NormalWord (Id 6) [T_Literal (Id 7) "sort"],T_NormalWord (Id 8) [T_Literal (Id 9) "file"]])]]))}
*ShellCheck.Parser>
*ShellCheck.Debug> stringToAst "sort file > tmp"
OuterToken (Id 1) (Inner_T_Annotation [] (OuterToken (Id 15) (Inner_T_Script (OuterToken (Id 0) (Inner_T_Literal "")) [OuterToken (Id 14) (Inner_T_Pipeline [] [OuterToken (Id 12) (Inner_T_Redirecting [OuterToken (Id 11) (Inner_T_FdRedirect "" (OuterToken (Id 10) (Inner_T_IoFile (OuterToken (Id 7) Inner_T_Greater) (OuterToken (Id 9) (Inner_T_NormalWord [OuterToken (Id 8) (Inner_T_Literal "tmp")])))))] (OuterToken (Id 13) (Inner_T_SimpleCommand [] [OuterToken (Id 4) (Inner_T_NormalWord [OuterToken (Id 3) (Inner_T_Literal "sort")]),OuterToken (Id 6) (Inner_T_NormalWord [OuterToken (Id 5) (Inner_T_Literal "file")])])))])])))
```
Alternatively, if we've looked at the unit tests and found the parser responsible for a syntax element, we can use `debugParse` to call it directly:
The AST node `T_Literal id str` is an alias for `OuterToken id (Inner_T_Literal str)`. GHC outputs the latter, which unfortunately makes it a bit difficult to read. However, with some effort we can see the part we're interested in:
```haskell
*ShellCheck.Parser> debugParse readIoRedirect "> tmp"
Right (T_FdRedirect (Id 0) "" (T_IoFile (Id 1) (T_Greater (Id 2)) (T_NormalWord (Id 3) [T_Literal (Id 4) "tmp"])))
(OuterToken (Id 10) (Inner_T_IoFile (OuterToken (Id 7) Inner_T_Greater) (OuterToken (Id 9) (Inner_T_NormalWord [OuterToken (Id 8) (Inner_T_Literal "tmp")]))))
```
Neither is very pretty, but we can see the part we're interested in:
This would be equivalent to: (TODO: find a way to format it this way automatically)
```haskell
(T_IoFile (Id 11) (T_Greater (Id 12)) (T_NormalWord (Id 13) [T_Literal (Id 14) "tmp"]))
(T_IoFile (Id 10) (T_Greater (Id 7)) (T_NormalWord (Id 9) [T_Literal (Id 8) "tmp"]))
```
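To illustrate why GHC prints the `OuterToken` form, here is a minimal, self-contained sketch of how such aliases can be built with pattern synonyms. The types below are simplified stand-ins that model only two constructors, and `literalText` is a hypothetical helper that is not part of ShellCheck; see `AST.hs` for the real definitions.

```haskell
{-# LANGUAGE PatternSynonyms #-}
module Main where

newtype Id = Id Int deriving (Show, Eq)

-- Simplified stand-ins for the AST types (assumption: the real ones
-- have many more constructors).
data InnerToken t
    = Inner_T_Literal String
    | Inner_T_NormalWord [t]
    deriving (Show, Eq)

data Token = OuterToken Id (InnerToken Token)
    deriving (Show, Eq)

-- Bidirectional pattern synonyms: usable both to build and to match nodes.
pattern T_Literal :: Id -> String -> Token
pattern T_Literal i str = OuterToken i (Inner_T_Literal str)

pattern T_NormalWord :: Id -> [Token] -> Token
pattern T_NormalWord i parts = OuterToken i (Inner_T_NormalWord parts)

-- Hypothetical helper: extract the text of a word made of a single literal.
literalText :: Token -> Maybe String
literalText (T_NormalWord _ [T_Literal _ s]) = Just s
literalText _ = Nothing

main :: IO ()
main = do
    let word = T_NormalWord (Id 9) [T_Literal (Id 8) "tmp"]
    -- The derived Show instance only knows the real constructors,
    -- so this prints the OuterToken/Inner_* form, not the synonyms.
    print word
    print (literalText word)
```

The synonyms keep pattern matches short, but since `Show` is derived from the underlying constructors, REPL output always uses the longer form.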
We can compare this with the definition in `AST.hs`:
@@ -156,13 +151,25 @@ and then append `checkTmpFilename` to the list of node checks at the top of the
]
```
Now we can compile and build to see the check apply:
We can now quick-reload the files with `:r`, and use ShellCheck.Debug's `shellcheckString` to run all of ShellCheck (minus output formatters):
```sh
cabal build && dist/build/shellcheck/shellcheck - <<< "sort file > tmp"
```
*ShellCheck.Debug> :r
[...]
[17 of 19] Compiling ShellCheck.Analyzer ( src/ShellCheck/Analyzer.hs, interpreted )
[18 of 19] Compiling ShellCheck.Checker ( src/ShellCheck/Checker.hs, interpreted )
[19 of 19] Compiling ShellCheck.Debug ( src/ShellCheck/Debug.hs, interpreted )
*ShellCheck.Debug> shellcheckString "sort file > tmp"
CheckResult {crFilename = "", crComments = [PositionedComment {pcStartPos = Position {posFile = "", posLine = 1, posColumn = 1}, pcEndPos = Position {posFile = "", posLine = 1, posColumn = 1}, pcComment = Comment {cSeverity = ErrorC, cCode = 9999, cMessage = "We found this node: (OuterToken (Id 10) (Inner_T_IoFile (OuterToken (Id 7) Inner_T_Greater) (OuterToken (Id 9) (Inner_T_NormalWord [OuterToken (Id 8) (Inner_T_Literal "tmp")]))))"}, pcFix = Nothing}]}
```
Alternatively, we can run it in interpreted mode, which is often way faster:
Or we can build and run it to see the check apply as it would when invoking `shellcheck`:
```sh
cabal run shellcheck - <<< "sort file > tmp"
```
Alternatively, we can run it in interpreted mode, which is almost as quick as `:r`:
```sh
./quickrun - <<< "sort file > tmp"
@@ -174,7 +181,7 @@ In either case, our warning now shows up:
In - line 1:
sort file > tmp
^-- SC2148: Tips depend on target shell and yours is unknown. Add a shebang.
^-- SC9999: We found this node: T_IoFile (Id 11) (T_Greater (Id 12)) (T_NormalWord (Id 13) [T_Literal (Id 14) "tmp"])
^-- SC9999: We found this node: (OuterToken (Id 10) (Inner_T_IoFile (OuterToken (Id 7) Inner_T_Greater) (OuterToken (Id 9) (Inner_T_NormalWord [OuterToken (Id 8) (Inner_T_Literal "tmp")]))))
```
Now we can flesh out the check. See `ASTLib.hs` and `AnalyzerLib.hs` for convenient functions to work with AST nodes, such as getting the name of an invoked command, getting a list of flags using canonical flag parsing rules, or in this case, getting the literal string of a `T_NormalWord` so that it doesn't matter if we use `> 'tmp'`, `> "tmp"` or `> "t"'m'p`:
@@ -201,3 +208,7 @@ We can run these tests with `cabal test`, or in interpreted mode with `./quickte
If we wanted to submit this test, we could run `./nextnumber` which will output the next unused SC2xxx code, e.g. 2213 as of writing.
We now have a completely functional test, yay!
For any questions like "How do I turn an X into a Y?", such as "shell string into an AST", "AST into a CFG", or "AST/CFG/DFA into a GraphViz representation", see [Debug.hs](https://github.com/koalaman/shellcheck/blob/master/src/ShellCheck/Debug.hs). It's very readable, and includes additional useful development information.
You can also find the ShellCheck author (me) on IRC as `koala_man` in `#haskell@libera.chat`.