Writing in the Terminal¶
So, I have been writing with plain text files for a while. I’ve plaid around with HTML and LaTeX extensively. Once I discovered the joys of vim, I slowly began to do all of my editing within a terminal emulator. Initially, I worked directly with the before mentioned markup languages. Recently, I learned about Pandoc and that let me give up on HTML by using Markdown, but I still wrote my scientific papers directly with LaTeX. Now I think it is time to make the full switch.
Writing with Markdown¶
Note
I no longer really use or recommend using Markdown to do serious writing (as of 2020-02-02). Ultimately, Markdown is good for short notes or very simple documents, but I kept finding myself writing filters to actually get Pandoc to do what I need it to do. To add insult to injury, Pandoc is not a full LaTeX parser. I kept finding myself having to work around its limitations by inserting LaTeX directly into the file which defeats the purpose. My solution has been to just use LaTeX directly (or Word when I have to, although I will write a draft in Markdown and use Pandoc to do the initial conversion).
Note
I have also started using reStructuredText more since it is more emenable to extension.
So, I have come to the realization that LaTeX is not the best way to
write. By the looks of it, I am not the first or second, and I
probably won’t be the last. Thankfully, with the emergence of
CommonMark, I can possibly make some head way. The beauty of Markdown
is the readability of the source file. I am a big fan of splitting
content and format. I already live in the command line so the plain
text files makes my life easier. By using pandoc
, I can convert to
HTML, PDF, or Word, but the challenge is getting the citations in place.
This documents my path to getting the citations to work with pandoc
.
Installation¶
The first order of business is to get things installed. I’m on OS X, so
I use MacPorts. To get pandoc
we simply call:
$ sudo port install pandoc
According to the manual, to add a bibliography, I need
pandoc-citeproc
, but it does not appear to be in MacPorts. Rooting
around on the repository, it appears that this has been recently
removed from pandoc
. This would explain why it is not evident. So,
we go directly to the Haskell method:
$ sudo port install haskell-platform # To get cabal
$ cabal install pandoc-citeproc
And then simply add ~/.cabal/bin
to your PATH
.
Running pandoc
¶
Now to try to make a test case:
$ cat test.text
A Title
=======
And they said stuff @narayanasway_thermal_2008.
$ cat reference.bib
@article{narayanaswamy_thermal_2008,
title = {Thermal near-field radiative transfer between two spheres},
volume = {77},
url = {http://link.aps.org/doi/10.1103/PhysRevB.77.075125},
doi = {10.1103/PhysRevB.77.075125},
number = {7},
journal = {Physical Review B},
author = {Narayanaswamy, Arvind and Chen, Gang},
month = feb,
year = {2008},
pages = {075125},
}
$ pandoc test.text -o test.pdf \
> --bibliography=./reference.bib --csl=physical-review-b.csl
And this gives me a lovely PDF! Granted, none of this is really new.
Customizing the Output¶
Now, what if I want to use a specific format such a that for PRB? I
could possibly use the --reference-docx=FILE
option if the journal
offers that; however, PRB does not. The -H FILE
flag will be useful
for adding in the preamble if necessary. I think I might be able to use
the .sty
file from the publisher, but that won’t let us set the
options to documentclass
. I think the ultimate answer will be to
generate the .latex
and then hand hack it to set the document class
correctly.:
$ pandoc --standalone test.text -o test.latex -H preamble.tex
$ vim test.latex # Fix the document class
$ pdflatex test
$ bibtex test
$ pdflatex test
$ pdflatex test
And now I should have a nice PDF in the desired format!
At the time of writing this (2014-09-17), I have not actually used this part as a work flow. I really hope this works.
Editing on the Command Line¶
Now that I have a way to write neat files, I need to worry about
editing. This derives from my attempt to create an editing work flow
for my thesis. I read through the silly thing and marked it up with my
trusty red pen. I was then stuck with how to automate searching for
misspellings and blatant grammatical errors. As I mentioned above, at
this point I was still using LaTeX so I had a bunch of plain text files
that I didn’t really want to convert to something like Word just to get
the automated grammar and spelling checkers. A web search turned up
this discussion and one of the answers was the Writer’s Workbench.
The specific tools that looked promising are aspell
and diction
.
And the bonus is that both are on MacPorts!
Working with aspell
¶
Here is the real trick: how do I get spell checking to work
intelligently? According to the info
page, it is as simple as:
$ aspell check file.text
However, the trick is the author names and technical jargon gets in the way. I could just add each one to the dictionary; however, I don’t really want to pollute a personal dictionary with project specific words that I may never use again. It’s one thing if my name is in there, but often, many names I will never reference again. So, I need a project specific list of words.
First, we tackle the author names. I am using Zotero to manage my
bibliography. This allows me to keep organized and has an option to
export to a BibTex file! It is plain text, so it can easily be parsed
to find the author names. We just have to look for the author
fields and parse what is within the {}
. The trick was splitting on
the and
token. A blind tokenizing on and
leads to breaking the
occasional name. The workaround was to split at the spaces and collect
the full names together. Care was needed when a name contained and
,
but it was manageable. I made wrote a script ztbib2spl
for future
use and put it in my dotfiles scripts folder.
Now we have a word list of author names to pass to aspell
to ignore!
Reading through the documentation, we first need to clean the list of
bad characters:
$ echo 'personal_ws-1.1 en 0' > authornames.en.pws
$ ztbib2spl reference.bib > authornames.text
$ aspell --lang=en --local-data-dir . clean < authornames.text >> authornames.en.pws
$ aspell --personal=./authornames.en.pws check file.text
The first line above is easy to over look. I was banging my head
against the wall for a while until I got it right. I went through a lot
of effort to get a binary version or a dictionary. That
personal_ws-1.1 en 0
indicates that this is a personal word list for
the English language dictionary and the number just needs to be there.
According to the info
page, it is supposed to be the number of
words, but even the documentation says the value does not matter. Don’t
miss the --data-dir .
flag! That tells aspell
to keep the file
in the current directory.
Checking the grammar¶
After we have run the spell checker, we can check for blatantly bad
grammar with diction
. The useful way to do this is to get the
suggestions with the -s
flag and pass it to the pager. I find it
best to start at the end of the file and work backwards. This preserves
the line numbers of the diction
output so that I can find the
sentences. The diction
tool only looks for blatantly bad grammar,
so we still need to do the manual revisions as well. When doing the
revisions, it is best to work with either diction
or the hand
written notes at one time. When trying to do both, it becomes hard to
keep track of where you are.
Conclusion¶
In this note, I have outlined a way to work within the command line to edit plain text files. The basic work flow is:
Run
aspell
Run
diction
Print and hand annotate
Run
diction
just to be sure (or not)Run
aspell
Repeat as necessary.
This allows me to remain comfortably in the terminal and working with plain text files.