In what’s never going to be a regular occurrence, I’m linking to a Twitter thread. Chris Espinosa tweets:
Just as I was wrapping up an email and getting ready to leave work, a co-worker rolled his chair over to show me an “interesting” thing.
Go ahead, read it.
UNIX, man. Not even once.
The zeitgeist is that the last 2 tweets are about Trump though…
Lo and behold! The Unix wizards at Bell labs foretold in ancient man files of the great baboon sent here to punish in the days when the sun would be eaten by the Night God Naztuli
Chris Espinosa @cdespinosa Aug 8
Just as I was wrapping up an email and getting ready to leave work, a co-worker rolled his chair over to show me an “interesting” thing. He was in Terminal, looking for documentation on a new feature, and had been told that it only existed as a Unix ‘man’ page. He wanted to email it to somebody so he’d used the Terminal feature to capture output to a file, and had opened it in a text editor, but in the text file the title of the command and certain words in the text had ddoouubblleedd cchhaarraacctteerrss.
And forty years of my life dissolved away as into a mist, like I had nibbled on Proust’s madeleine. And I laughed and cried at the same time (Reply to this post if you know where this is going). Of course, I told him, Terminal is a tty, and man thinks that it’s a Decwriter from 1975. It is printing char-backspace-char to make it bold; Terminal simulates the overstrike. But when it’s copied to a file it’s just 0x08 and nonprinting, and you see the doubled characters.
But I remembered, from 1997 when NeXT came in, or 1987 when I managed A/UX, or 1977 when I troffed the Apple II Reference Manual that the man page for man had the instructions for how to stop this. A flag? Redirecting standard out to a non-tty pipe? ‘man man’ it is… And though the man page has changed a lot in 40 years, it was still there. | col -b > filename.txt to strip the doubled characters. With also the magic words “reverse line feed.” Ah! Pinfeed perforated paper on a letter-quality printer!
See, there were these rollers with pins on them that fed the paper. But to tear off the last page the perf needed to be above the pin feed. That left 2” unused so if your top margin was less than that, it needed to inject a Form Feed, then six lines of Reverse Line Feed to print the top of the page but if you were piping to a text file the reverse line feeds would just overprint the last lines on the previous page. col filtered them out.
The memory of years of fussing with pinfeed paper and RIBBONS for God’s sake and top alignment and that in 2017 all that code is still there but the main point is that it being a living document we have no idea of what it originally said without the edit history.
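For anyone who wants to see the overstrike trick without digging out a DECwriter, here is a minimal Python sketch of what the thread describes `col -b` doing: keeping only the last character written to each column. The function name `strip_overstrikes` is made up for illustration, and this deliberately ignores everything else col handles (reverse line feeds, tabs, escape sequences), so treat it as a toy model rather than a replacement.

```python
def strip_overstrikes(text: str) -> str:
    """Keep only the last character written to each column of each line.

    A backspace (0x08) moves the "print head" one column left, so the
    next character overwrites whatever was printed there before.
    """
    cleaned_lines = []
    for line in text.split("\n"):
        cells = []   # one entry per column on this line
        col = 0      # current print-head position
        for ch in line:
            if ch == "\b":
                col = max(col - 1, 0)   # cannot back up past column 0
            elif col < len(cells):
                cells[col] = ch         # overstrike an existing column
                col += 1
            else:
                cells.append(ch)        # first character in this column
                col += 1
        cleaned_lines.append("".join(cells))
    return "\n".join(cleaned_lines)


if __name__ == "__main__":
    # "NAME" in bold (each letter struck twice), then an underlined word.
    raw = "N\bNA\bAM\bME\bE\n_\bm_\ba_\bn"
    print(strip_overstrikes(raw))
```

Run against the raw bytes of a captured man page, this should turn the doubled title back into plain text, which is the visible effect of piping through `col -b`.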
That was a fun, if not entirely surprising read. Having worked as a teletype operator right out of high school (yes, the actual teletype machines with green screens and line printers), this was where I expected it to go.
God I love UNIX.
You have to, because it runs the Internet. 🙂
Yeah, I’m old but I’m not that old. I’ll get off your lawn now. 😉
I wouldn’t consider most of your examples to be “convenient”. I recall piping the output of man to nroff, but that’s a really old memory.
Personally, I find that
man ls > ls.txt
is all I need to do under my current bash environment.
When I do this in my (home) UNIX environment (not Linux, not bash), I get a text file with control characters (the ^H as backspace for “overprinting” as well as the double characters and the _ underlining). Example line:
-^H-D^HD _^Hf_^Ho_^Hr_^Hm_^Ha_^Ht
This is the text found in the text file for
-D format
Reading the file with “less ls.txt” again restores the look of the manpage including highlighted and underlined characters, in this example, “-D” printed bold, and “format” underlined. Outputting it with “cat ls.txt” directly to the terminal does not display the control characters – normal text is written to the terminal. But they are still there – “cat ls.txt | less” leads to the “manpage style” again.
It might be possible that in your environment the control characters are stripped automatically. Maybe your man’s roff implementation (nroff, groff, troff, etc) behaves differently, maybe the shell changes $PAGER for the process to something other than the traditional “less” upon redirection…
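How a pager gets from those bytes back to bold and underline can be sketched in a few lines of Python. This is only an illustration of the convention visible in the example line above (same character struck twice means bold, underscore struck first means underline); it is not how less is actually implemented, and the name `decode_overstrikes` is hypothetical.

```python
def decode_overstrikes(line: str) -> list[tuple[str, str]]:
    """Split a line into (character, style) pairs.

    style is "bold" for X<BS>X, "underline" for _<BS>X, else "plain".
    The ambiguous sequence _<BS>_ is treated as bold in this sketch.
    """
    result = []
    i = 0
    while i < len(line):
        # Look for the three-byte pattern: char, backspace, char.
        if i + 2 < len(line) and line[i + 1] == "\b":
            first, second = line[i], line[i + 2]
            if first == second:
                style = "bold"
            elif first == "_":
                style = "underline"
            else:
                style = "plain"   # unknown overstrike, keep the last char
            result.append((second, style))
            i += 3
        else:
            result.append((line[i], "plain"))
            i += 1
    return result


if __name__ == "__main__":
    # The example line from the text file, with ^H written as \b.
    sample = "-\b-D\bD _\bf_\bo_\br_\bm_\ba_\bt"
    for ch, style in decode_overstrikes(sample):
        print(repr(ch), style)
```

A terminal-side renderer would then map "bold" and "underline" to whatever the display supports, which is why the same file looks different through less, cat, and a text editor.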
Gives a correctly formatted text file on Ubuntu 17.04.
I actually expected the behavior you saw– even did hexdump -C to verify there are no hidden control characters.
Interestingly, under FreeBSD, I get the behavior described above– control characters and double-characters for bold text.
Yes, the article is really misleading…
A simple
man to pdf
search in Google showed a simple way to get a pdf file from `man bash`:
man -t bash | ps2pdf - bash.pdf
Edited 2017-08-14 08:16 UTC
What OS was he using? I’m guessing macOS, since he’s referring to “Terminal”, rather than “the terminal”, or xterm, gnome-terminal, etc etc.
And if that’s the case, well, which new features of macOS have documentation that exists only as a man page?
That’s the real question.
Or maybe IRIX, BeOS, Haiku, Syllable, AtheOS or perhaps SkyOS. 😛
Edited 2017-08-10 01:25 UTC
And none of those have any new features.
Except maybe Haiku.
SkyOS…. I almost forgot about it…
I’m guessing macOS because we’re talking about Chris Espinosa:
https://en.wikipedia.org/wiki/Chris_Espinosa
Which also explains why his shell is so bad it didn’t handle the conversion in the streaming operator like a modern Linux shell.
Check your termcap if it has LA36 or LA180 definitions, it’ll be an interesting read.
Also the caps makes it easier to parse with DEC runoff which is nice for those of us with VT terminals still attached to our VAX and pdp-11’s.
You must be kidding, you still have a working PDP-11, right? It was one of the first computers I had access to at my uni, like ~33 years ago!
On a side note, I’m not surprised at all that things like “double characters” and many other ancient capabilities are still around. In the Unix world it used to be “if it is not broken ..”.
Many of us that connect remotely to fix/check things still prefer to use “character” terminals for fast and practical reasons and in this world esc, \b, \r and other special chars and sequences are still king, and fun to play with.
Yep. It’s like classic cars, plenty of us aa^Hdd^Hdd^Hii^Hcc^Htt^Hss^H enthusiasts out there that still run them. Granted you can emulate faster than real hardware, but part of the charm is restoring, running and operating these old girls. I have a few. Once you’ve moved in the circles, pdp-11 owners come out of the woodwork. Plenty around.
There are even modern extensions nowadays such as Bilquist’s BQTCP, Telnet, FTP, and web server (BQHTTP) for them. The old girls continue to run and be enjoyed. If you just want to emulate one, try simh as a starting point and see if you catch the bug. If you do, a cheap DECServer550 with the resistor and ROM hack is a cheap and easy entry-point.
I usually use something like:
man -t man | lp
Tell this to your coworkers…
They wanted to email the output; in that case I would use
man man > file.txt
That is quite simple, just use the power of UNIX:
% man rm | mail -s "How to add a remark to files" [email protected]
😉
Nobody commenting on why he’s using Twitter to discuss man overprinting? With the char limit, it ended up as 20 tweets in all – wrong social media platform methinks 🙂
rklrkl,
Yep, you’re absolutely right. It’s the “in” social media platform, but for technical merit it leaves a lot to be desired. I lean towards more in-depth discussion with good amounts of context, whereas twitter effectively undermines all of that.
On the other hand, I do think it’s the ideal platform for anyone with an aversion to detail, like presidents who like to talk a lot yet actually have very little to say.
Back in the day I helped convert a huge documentation set from roff format (so, man pages) to Word for DOS via some parser that converted the roff to RTF.
man’s ecosystem has always been awful, which is why there have been so many attempts to replace it (POD, for example).
$ man “not even once”
No manual entry for not even once
$
As others have said, the terminal being used isn’t handling the overstrikes. With that said, in their raw form, the roff man pages can be formatted for output proper to your end device. Shoot, you can even have it output HTML (with a few hiccups… depending on what your man supports).
Dare I even say, man col (if this is an old Unix host especially)
This is the first time in many, many years that I have gone to Twitter to read anything.
Ok, so perhaps I am an old fart, but WTF?!?
Does everyone read crap like this?
I mean, the story is not crap. Maybe. Dunno.
But having to read it in extremely small chunks.
It is.
Kind of.
Annoying.
Really.
Damn.
I couldn’t make it.
Edited 2017-08-10 23:46 UTC
number9,
twitter is the geocities of millennials.
I think Geocities would be offended at the comparison
Not being a twit
reading that
is a bloody pain!