One of the areas I work in is writing compilers which generate C code as an intermediate step (sometimes known as transpilers), for older languages such as Algol 60, Atlas Autocode, or IMP. I also write static binary translators for video games which convert binaries of old 8-bit microcomputers into C sources for recompilation in order to modify the games or port them to more modern systems.
Sometimes the generated C code is just used as a step in creating a binary, and remains unseen by the programmer, but in other cases you want to generate readable and maintainable C code in order to port the program to a modern system in C, with the intention of maintaining the C translation in the future.
When this happens, it would be nice to reformat the generated C in a readable style, since machine-generated code can often look pretty ugly! Unfortunately the best known source code reformatter, indent, is not the best, so if you are in a similar position and looking for a good reformatter, let me show you the options.
If you're a 'TLDR' person and don't have the patience to read all this, my conclusion is that clang-format from the clang compiler package is the best simple choice currently available.
I will be including some samples of the outputs of each of these formatters on this page for comparison (when I get around to it ;-) ). Just to make them a little easier to read, they'll also be run through a 'c to html' filter, but that will not reformat them at all, it only colourizes the output and ensures that characters such as '<' are readable on any file hosted on this web server. You can remove the .html part from any of those links to see the raw C file.
As well as actual formatting programs, you may need to look at utilities to remove carriage returns from source files and utilities to remove/insert tabs or convert tabs to the appropriate number of spaces. These will not be covered here. Similarly, programs exist to extract procedure headings from .c files to generate .h files, to create Makefiles, and to convert procedure definitions between K&R syntax and ANSI syntax. These also will not be covered here.
Indent is probably the best known of these among old-time programmers and is almost ubiquitous, as it has come as standard on most Unix systems since about 1976, when it was a UC Berkeley product. However, it has not been updated for some years and does not understand all the modern (C99 etc.) C constructs, or the unofficial language extensions supported by gcc (GNU cc) and clang, which is rather ironic as it is now a GNU-maintained program. Since language translation to C can frequently rely on language extensions (such as nested procedure definitions in the Algol languages), this can be a problem, especially in cases where the breakage throws off the formatting for the rest of the file. I did submit a request a few years ago to update it to support nested procedures, but I doubt that the feature has been added yet: it is used by a tiny minority of C programmers, and I am always afraid it might be removed from GCC some day due to 'lack of interest'. (Clang does not support the same style of nested procedures, although I think clang-format might.)
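To make the nested-procedure problem concrete, here is a minimal sketch of the kind of C a translator might emit for an Algol-style procedure that contains an inner procedure. It relies on GCC's nested-function extension, so it should build with gcc but not with clang or a strict ISO C compiler. The names sum_of_squares and add_square are made up for illustration and do not come from any real translator output.

```c
/* GNU C only: nested function definitions are a GCC extension, not ISO C. */
#include <assert.h>

/* Hypothetical translation of an outer procedure with one inner procedure. */
int sum_of_squares(int n)
{
    int total = 0;              /* outer local, visible to the inner procedure */

    void add_square(int k)      /* inner procedure, as Algol 60 or IMP allow */
    {
        total += k * k;         /* updates the enclosing procedure's variable */
    }

    assert(n >= 0);             /* illustrative precondition */
    for (int i = 1; i <= n; i++)
        add_square(i);
    return total;
}
```

A formatter that does not expect a function body inside another function body can lose track of the indentation level at add_square, which is the sort of breakage described above.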
You will need to tune the output of indent to your taste with a profile file named .indent.pro, which can be per-directory or global. Here are the contents of mine:
-nut -bad -dj -bap -ci1 -cli0.5 -di4 -nbc -nfc1 -i2 -npsl -nsc -br -brf -cdw -ce -sob -fca -l180 -ss --no-tabs
I can point you at two distinct sources of indent: version 2.2.9 (http://www.gtoal.com/src/c-utils/indent/indent-2.2.9/), which I found on the net, and the Acorn version (http://www.gtoal.com/src/c-utils/indent/indent-acorn/), which is the one I used to use in the 1980s when I worked at Acorn Computers and which I cleaned up considerably at the time, as ANSI C was just coming into its own. I believe both of those will still compile OK on modern systems. The main repository for this version of indent may be https://www.gnu.org/software/indent/, although it appears there's a rival version called cindent at https://invisible-island.net/cindent/cindent.html
cb is roughly contemporary with indent, but much smaller and more lightweight code, which means it is also not as functional. It's worth including in a project in source form as a backup formatting utility if the user is on a system with no installed formatter at all. Source is online here at http://www.gtoal.com/src/c-utils/cb.c
I would not be at all surprised if cb fails to handle things like modern C++-style "//" comments.
Archive: c-format.zip
A program to format C programs under VMS
(Compiled with VAX C v3.2, linked under VMS v5.1)
(VMS file attributes saved---use UnZip v5.x+ to unzip)
   Length     Date     Time   Name
---------  ---------- -----   ----
      882  1992-12-21 19:31   aaareadme.txt
       98  1992-12-21 19:31   build.com
    18088  1992-12-21 19:31   c-format.c
     8192  1992-12-21 20:34   c-format.exe
     8082  1992-12-21 20:33   c-format.obj
---------                     -------
    35342                     5 files
This is another short program in the style of cb.
---
Language: Cpp
# BasedOnStyle: Google
AccessModifierOffset: -1
AlignAfterOpenBracket: Align
AlignConsecutiveAssignments: false
AlignConsecutiveDeclarations: false
AlignEscapedNewlines: Left
AlignOperands: true
AlignTrailingComments: true
AllowAllParametersOfDeclarationOnNextLine: true
AllowShortBlocksOnASingleLine: false
AllowShortCaseLabelsOnASingleLine: false
AllowShortFunctionsOnASingleLine: All
AllowShortIfStatementsOnASingleLine: true
AllowShortLoopsOnASingleLine: true
AlwaysBreakAfterDefinitionReturnType: None
AlwaysBreakAfterReturnType: None
AlwaysBreakBeforeMultilineStrings: true
AlwaysBreakTemplateDeclarations: true
BinPackArguments: true
BinPackParameters: true
BraceWrapping:
  AfterClass: false
  AfterControlStatement: false
  AfterEnum: false
  AfterFunction: false
  AfterNamespace: false
  AfterObjCDeclaration: false
  AfterStruct: false
  AfterUnion: false
  AfterExternBlock: false
  BeforeCatch: false
  BeforeElse: false
  IndentBraces: false
  SplitEmptyFunction: true
  SplitEmptyRecord: true
  SplitEmptyNamespace: true
BreakBeforeBinaryOperators: None
BreakBeforeBraces: Attach
BreakBeforeInheritanceComma: false
BreakBeforeTernaryOperators: true
BreakConstructorInitializersBeforeComma: false
BreakConstructorInitializers: BeforeColon
BreakAfterJavaFieldAnnotations: false
BreakStringLiterals: true
# 80 was unreasonable. 0 might be better.
ColumnLimit: 120
CommentPragmas: '^ IWYU pragma:'
CompactNamespaces: false
ConstructorInitializerAllOnOneLineOrOnePerLine: true
ConstructorInitializerIndentWidth: 4
ContinuationIndentWidth: 4
Cpp11BracedListStyle: true
DerivePointerAlignment: true
DisableFormat: false
ExperimentalAutoDetectBinPacking: false
FixNamespaceComments: true
ForEachMacros:
  - foreach
  - Q_FOREACH
  - BOOST_FOREACH
IncludeBlocks: Preserve
IncludeCategories:
  - Regex: '^'
    Priority: 2
  - Regex: '^<.*\.h>'
    Priority: 1
  - Regex: '^<.*'
    Priority: 2
  - Regex: '.*'
    Priority: 3
IncludeIsMainRegex: '([-_](test|unittest))?$'
IndentCaseLabels: true
IndentPPDirectives: None
IndentWidth: 2
IndentWrappedFunctionNames: false
JavaScriptQuotes: Leave
JavaScriptWrapImports: true
KeepEmptyLinesAtTheStartOfBlocks: false
MacroBlockBegin: ''
MacroBlockEnd: ''
MaxEmptyLinesToKeep: 1
NamespaceIndentation: None
ObjCBlockIndentWidth: 2
ObjCSpaceAfterProperty: false
ObjCSpaceBeforeProtocolList: false
PenaltyBreakAssignment: 2
PenaltyBreakBeforeFirstCallParameter: 1
PenaltyBreakComment: 300
PenaltyBreakFirstLessLess: 120
PenaltyBreakString: 1000
PenaltyExcessCharacter: 1000000
PenaltyReturnTypeOnItsOwnLine: 200
PointerAlignment: Left
RawStringFormats:
  - Delimiter: pb
    Language: TextProto
    BasedOnStyle: google
ReflowComments: false
# are you serious? Sorting includes can break programs!
SortIncludes: false
SortUsingDeclarations: true
SpaceAfterCStyleCast: false
SpaceAfterTemplateKeyword: true
SpaceBeforeAssignmentOperators: true
SpaceBeforeParens: ControlStatements
SpaceInEmptyParentheses: false
SpacesBeforeTrailingComments: 2
SpacesInAngles: false
SpacesInContainerLiterals: true
SpacesInCStyleCastParentheses: false
SpacesInParentheses: false
SpacesInSquareBrackets: false
Standard: Auto
TabWidth: 8
UseTab: Never
...
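To give a feel for what a few of these clang-format settings mean in practice, here is a small hand-written C fragment laid out roughly in the style they describe. It is an illustration only, not actual clang-format output, and the function names is_digit and parse_uint are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// AllowShortFunctionsOnASingleLine: All keeps a tiny function on one line.
static int is_digit(int c) { return c >= '0' && c <= '9'; }

// IndentWidth: 2 gives two-space indents; BreakBeforeBraces: Attach keeps the
// opening brace on the same line as the function header or control statement.
int parse_uint(const char *s, size_t len) {
  assert(s != NULL);  // SpacesBeforeTrailingComments: 2
  int value = 0;
  for (size_t i = 0; i < len; i++) {
    if (!is_digit(s[i])) return -1;  // AllowShortIfStatementsOnASingleLine: true
    value = value * 10 + (s[i] - '0');
  }
  return value;
}
```

Since DerivePointerAlignment is true, I would expect the pointer style already used in a file (here `const char *s`) to be kept rather than forced to the PointerAlignment: Left setting, but treat that as my reading of the options rather than a tested result.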