Whenever I download a file whose name contains dangerous characters, or receive such a file as an email attachment, I get mildly frustrated. To address this, I wrote a Perl script called fix-file-names that renames such files to safe names. The script is given below:
    #!/usr/bin/perl
    # fix-file-names - change file names to safe names, e.g. space to _ etc.
    # 2009-2020 Vlado Keselj [email protected] http://vlado.ca last update:2020-12-08
    # Usage: fix-file-names f1 f2 ...

    for my $fnold (@ARGV) {
        my $fnnew = &fix_filename($fnold);
        if ($fnnew eq $fnold) { print "$fnnew \t\tthe same file name kept!\n" }
        else {
            if (-e $fnnew) { die "$fnnew already exists!" }
            print "$fnold \t-> $fnnew\n";
            rename($fnold,$fnnew) or die;
        }
    }

    sub fix_filename {
        local $_ = shift;
        s/^-/F-/;
        s/ +- +/-/g;
        s/''+/--/g; s/'/-/g;
        s/[[(<{]/_-/g; s/[])>}]/-_/g;
        s/[,:;]\s*/--/g; s/&/and/g;
        s/ /_/g; s/__+/_/g; s/---+/--/g;
        s/\xE2\x80\x99/-/g; # Single right quote
        s/[^\w.-]/"0x".uc unpack("H2",$&)/ge;
        return $_;
    }

    # 2020-12-06
    # - =HH encoding is replaced with 0xHH since '=' is a special character in
    #   shell (bash)

The script first rewrites various common constructs in file names into roughly similar but safe equivalents (for example, spaces become underscores, and opening and closing brackets become _- and -_). Finally, it replaces any remaining potentially unsafe character, that is, anything other than a letter, digit, underscore, period, or hyphen, with a hexadecimal 0xHH code.
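To see what these substitutions do in practice, here is a minimal dry-run sketch. It uses a copy of the fix_filename routine from the script above and applies it to a few sample names without renaming anything. The sample names are made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy of the fix_filename routine from the script above; the
# substitutions themselves are unchanged.
sub fix_filename {
    local $_ = shift;
    s/^-/F-/;
    s/ +- +/-/g;
    s/''+/--/g; s/'/-/g;
    s/[[(<{]/_-/g; s/[])>}]/-_/g;
    s/[,:;]\s*/--/g; s/&/and/g;
    s/ /_/g; s/__+/_/g; s/---+/--/g;
    s/\xE2\x80\x99/-/g; # Single right quote
    s/[^\w.-]/"0x".uc unpack("H2",$&)/ge;
    return $_;
}

# Hypothetical sample names, invented for this example.
my @samples = (
    'My Report (final).pdf',
    'Tom & Jerry - episode 1.avi',
    'notes: draft, v2.txt',
    '-rf.txt',
    'a+b=c.txt',
);

# Dry run: print the old and new names only; no file is touched.
for my $name (@samples) {
    print "$name \t-> ", fix_filename($name), "\n";
}
```

With these inputs, 'My Report (final).pdf' becomes My_Report_-final-_.pdf, 'Tom & Jerry - episode 1.avi' becomes Tom_and_Jerry-episode_1.avi, and a+b=c.txt becomes a0x2Bb0x3Dc.txt, since + and = fall through to the final 0xHH step.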
The package fix-filenames by Martin Zagora is an interesting open-source program in TypeScript (JavaScript) that fixes filenames by recoding some non-ASCII characters. It contains a mapping of non-ASCII characters to ASCII string equivalents. It is available on GitHub at https://github.com/zaggino/fix-filenames.
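The idea of mapping non-ASCII characters to ASCII equivalents could be sketched in Perl roughly as follows. This is only an illustration: the %ascii_for table and the to_ascii function are hypothetical names, and the few entries shown are not taken from the fix-filenames package. The sketch also assumes the name has already been decoded into characters; names arriving through @ARGV are typically raw bytes and would first need decoding, for example with Encode::decode.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical mapping for illustration only; it is NOT the table
# used by the fix-filenames package.
my %ascii_for = (
    "\x{E9}" => 'e',    # e with acute accent
    "\x{FC}" => 'u',    # u with diaeresis
    "\x{DF}" => 'ss',   # German sharp s
    "\x{F8}" => 'o',    # o with stroke
);

# Replace each non-ASCII character that has a mapping;
# characters without a mapping are left unchanged.
sub to_ascii {
    my ($name) = @_;
    $name =~ s/([^\x00-\x7F])/$ascii_for{$1} \/\/ $1/ge;
    return $name;
}

# Assumes a decoded (character) string rather than raw UTF-8 bytes.
print to_ascii("Caf\x{E9}_Stra\x{DF}e.txt"), "\n";
```

A pass like this could run before fix_filename, so that accented letters become readable ASCII instead of falling through to the 0xHH encoding.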