Updated DevGuide (markdown)

2025-10-03 19:29:44 +08:00 · 2022-07-20 10:43:36 -07:00
parent 61841485ee
commit 0149832c28
1 changed files with 37 additions and 26 deletions
--- a/DevGuide.md
+++ b/DevGuide.md
@@ -89,41 +89,36 @@ ShellCheck has multiple output formatters. These take parsing results and output

 Let's say that we have a pet peeve: people who use `tmp` as a temporary filename. We want to warn about statements like `sort file > tmp && mv tmp file`, and suggest using `mktemp` instead.

-We can start by looking at the AST for `sort file > tmp`. In a ShellCheck source directory, run `cabal build` to generate necessary files, then run `cabal repl` and `:load src/ShellCheck/Parser.hs`:
+To get started, clone the ShellCheck repository and run `cabal repl` followed by `:load ShellCheck.Debug`. This is a development module that offers access to a number of convenient functions, helpfully listed in [Debug.hs](https://github.com/koalaman/shellcheck/blob/master/src/ShellCheck/Debug.hs):

 ```
-*ShellCheck.AST> :load src/ShellCheck/Parser.hs
-
-[1 of 7] Compiling Paths_ShellCheck ( /home/james/repos/shellcheck/dist-newstyle/build/x86_64-linux/ghc-8.8.4/ShellCheck-0.7.2/build/autogen/Paths_ShellCheck.hs, interpreted )
-[2 of 7] Compiling ShellCheck.Regex ( src/ShellCheck/Regex.hs, interpreted )
-[3 of 7] Compiling ShellCheck.AST   ( src/ShellCheck/AST.hs, interpreted )
-[4 of 7] Compiling ShellCheck.Interface ( src/ShellCheck/Interface.hs, interpreted )
-[5 of 7] Compiling ShellCheck.Data  ( src/ShellCheck/Data.hs, interpreted )
-[6 of 7] Compiling ShellCheck.ASTLib ( src/ShellCheck/ASTLib.hs, interpreted )
-[7 of 7] Compiling ShellCheck.Parser ( src/ShellCheck/Parser.hs, interpreted )
-Ok, 7 modules loaded.
-*ShellCheck.Parser>
+*ShellCheck.AST> :load ShellCheck.Debug
+[...]
+[16 of 19] Compiling ShellCheck.Analytics ( src/ShellCheck/Analytics.hs, interpreted )
+[17 of 19] Compiling ShellCheck.Analyzer ( src/ShellCheck/Analyzer.hs, interpreted )
+[18 of 19] Compiling ShellCheck.Checker ( src/ShellCheck/Checker.hs, interpreted )
+[19 of 19] Compiling ShellCheck.Debug ( src/ShellCheck/Debug.hs, interpreted )
+Ok, 19 modules loaded.
+*ShellCheck.Debug> 
 ```

-This has given us a REPL where we can call parsing functions. There's a convenient `debugParseScript` function that will take a string and give the complete parser result (minus noisy token positions):
+Now we can look at the AST for our command:

 ```haskell
-*ShellCheck.Parser> debugParseScript "sort file > tmp"
-ParseResult {prComments = [], prTokenPositions = fromList [(Id 0,Position {posFile = "removed for clarity", posLine = -1, posColumn = -1})], prRoot = Just (T_Annotation (Id 1) [] (T_Script (Id 0) "" [T_Pipeline (Id 3) [] [T_Redirecting (Id 4) [T_FdRedirect (Id 10) "" (T_IoFile (Id 11) (T_Greater (Id 12)) (T_NormalWord (Id 13) [T_Literal (Id 14) "tmp"]))] (T_SimpleCommand (Id 5) [] [T_NormalWord (Id 6) [T_Literal (Id 7) "sort"],T_NormalWord (Id 8) [T_Literal (Id 9) "file"]])]]))}
-*ShellCheck.Parser>
+*ShellCheck.Debug> stringToAst "sort file > tmp"
+OuterToken (Id 1) (Inner_T_Annotation [] (OuterToken (Id 15) (Inner_T_Script (OuterToken (Id 0) (Inner_T_Literal "")) [OuterToken (Id 14) (Inner_T_Pipeline [] [OuterToken (Id 12) (Inner_T_Redirecting [OuterToken (Id 11) (Inner_T_FdRedirect "" (OuterToken (Id 10) (Inner_T_IoFile (OuterToken (Id 7) Inner_T_Greater) (OuterToken (Id 9) (Inner_T_NormalWord [OuterToken (Id 8) (Inner_T_Literal "tmp")])))))] (OuterToken (Id 13) (Inner_T_SimpleCommand [] [OuterToken (Id 4) (Inner_T_NormalWord [OuterToken (Id 3) (Inner_T_Literal "sort")]),OuterToken (Id 6) (Inner_T_NormalWord [OuterToken (Id 5) (Inner_T_Literal "file")])])))])])))
 ```

-Alternatively, if we've looked at the unit tests and found the parser responsible for a syntax element, we can use `debugParse` to call it directly:
+(The AST node `T_Literal id str` is an alias for `OuterToken (Id id) (Inner_T_Literal str)`. GHC outputs the latter, unfortunately making it a bit difficult to read.) However, with some effort we can see the part we're interested in:

 ```haskell
-*ShellCheck.Parser> debugParse readIoRedirect  "> tmp"
-Right (T_FdRedirect (Id 0) "" (T_IoFile (Id 1) (T_Greater (Id 2)) (T_NormalWord (Id 3) [T_Literal (Id 4) "tmp"])))
+(OuterToken (Id 10) (Inner_T_IoFile (OuterToken (Id 7) Inner_T_Greater) (OuterToken (Id 9) (Inner_T_NormalWord [OuterToken (Id 8) (Inner_T_Literal "tmp")]))))
 ```

-Neither is very pretty, but we can see the part we're interested in:
+This would be equivalent to: (TODO: find a way to format it this way automatically)

 ```haskell
-(T_IoFile (Id 11) (T_Greater (Id 12)) (T_NormalWord (Id 13) [T_Literal (Id 14) "tmp"]))
+(T_IoFile (Id 10) (T_Greater (Id 7)) (T_NormalWord (Id 9) [T_Literal (Id 8) "tmp"]))
 ```

 We can compare this with the definition in `AST.hs`:
@@ -156,13 +151,25 @@ and then append `checkTmpFilename` to the list of node checks at the top of the
    ]
 ```

-Now we can compile and build to see the check apply:
+We can now quick-reload the files with `:r`, and use ShellCheck.Debug's `shellcheckString` to run all of ShellCheck (minus output formatters):

-```sh
-cabal build && dist/build/shellcheck/shellcheck - <<< "sort file > tmp"
+```
+*ShellCheck.Debug> :r
+[...]
+[17 of 19] Compiling ShellCheck.Analyzer ( src/ShellCheck/Analyzer.hs, interpreted )
+[18 of 19] Compiling ShellCheck.Checker ( src/ShellCheck/Checker.hs, interpreted )
+[19 of 19] Compiling ShellCheck.Debug ( src/ShellCheck/Debug.hs, interpreted )
+*ShellCheck.Debug> shellcheckString "sort file > tmp"
+CheckResult {crFilename = "", crComments = [PositionedComment {pcStartPos = Position {posFile = "", posLine = 1, posColumn = 1}, pcEndPos = Position {posFile = "", posLine = 1, posColumn = 1}, pcComment = Comment {cSeverity = ErrorC, cCode = 9999, cMessage = "We found this node: (OuterToken (Id 10) (Inner_T_IoFile (OuterToken (Id 7) Inner_T_Greater) (OuterToken (Id 9) (Inner_T_NormalWord [OuterToken (Id 8) (Inner_T_Literal "tmp")]))))"}, pcFix = Nothing}]}
 ```

-Alternatively, we can run it in interpreted mode, which is often way faster:
+Or build and run it to see the check apply as it would when invoking `shellcheck`:
+
+```sh
+cabal run shellcheck - <<< "sort file > tmp"
+```
+
+Alternatively, we can run it in interpreted mode, which is almost as quick as `:r`:

 ```sh
 ./quickrun - <<< "sort file > tmp"
@@ -174,7 +181,7 @@ In either case, our warning now shows up:
 In - line 1:
 sort file > tmp
 ^-- SC2148: Tips depend on target shell and yours is unknown. Add a shebang.
-          ^-- SC9999: We found this node: T_IoFile (Id 11) (T_Greater (Id 12)) (T_NormalWord (Id 13) [T_Literal (Id 14) "tmp"])
+          ^-- SC9999: We found this node: (OuterToken (Id 10) (Inner_T_IoFile (OuterToken (Id 7) Inner_T_Greater) (OuterToken (Id 9) (Inner_T_NormalWord [OuterToken (Id 8) (Inner_T_Literal "tmp")]))))
 ```

 Now we can flesh out the check. See `ASTLib.hs` and `AnalyzerLib.hs` for convenient functions to work with AST nodes, such as getting the name of an invoked command, getting a list of flags using canonical flag parsing rules, or in this case, getting the literal string of a `T_NormalWord` so that it doesn't matter if we use `> 'tmp'`, `> "tmp"` or `> "t"'m'p`:
@@ -201,3 +208,7 @@ We can run these tests with `cabal test`, or in interpreted mode with `./quickte
 If we wanted to submit this test, we could run `./nextnumber` which will output the next unused SC2xxx code, e.g. 2213 as of writing.

 We now have a completely functional test, yay!
+
+For questions of the form "How do I turn an X into a Y?", such as "shell string into an AST", "AST into a CFG", or "AST/CFG/DFA into a GraphViz representation", see [Debug.hs](https://github.com/koalaman/shellcheck/blob/master/src/ShellCheck/Debug.hs). It's very readable, and includes additional useful development information.
+
+You can also find the ShellCheck author (me) on IRC as `koala_man` in `#haskell@libera.chat`.