Spreadsheet/regex questions - Printable Version
+- kanji koohii FORUM (http://forum.koohii.com)
+-- Forum: Learning Japanese (http://forum.koohii.com/forum-4.html)
+--- Forum: Learning resources (http://forum.koohii.com/forum-9.html)
+--- Thread: Spreadsheet/regex questions (/thread-13039.html)

Spreadsheet/regex questions - Thora - 2015-09-24

Hi folks. I'm hoping to separate a block of individual kanji into one kanji per row for use in a spreadsheet.

Is there a simple way to use regex in Notepad or Open Office to insert a line break after each kanji? After much searching, I came across some indications I might try something like:

Find: (.)
Replace: \1\n

Unfortunately, that doesn't work. I didn't find anything in Open Office's spreadsheet program either. Any solution would have to be simple since I have no idea what I'm doing.

Spreadsheet/regex questions - yogert909 - 2015-09-24

What's not working? That should add a line break after every single character... If you want line breaks after just kanji, try changing (.) to ([\u4e00-\u9faf]).

It might help more if you post a few lines from your input file and an example of what your desired output looks like.

Spreadsheet/regex questions - anotherjohn - 2015-09-24

You could try importing into the spreadsheet first and separating them there.

Put the numbers 1 - 9999 or whatever in column A
In C1 put the kanji all on one line
In B1 put: =mid(c$1,a1,1) and copy down

Spreadsheet/regex questions - Thora - 2015-09-24

Thank you both for your quick responses.

yogert909, I just discovered that using Replace: $1\n works in OpenOffice Writer (rather than \1\n). Neither of them worked in Notepad++. It would result in "XE9 [line break]X80 XA2" instead. ??

Thanks for the tip about using [\u4e00-\u9faf] for kanji. I did come across mention of using [1-9] and [a-z] for alphanumeric and wondered what would work for kanji.

I believe you correctly understood what I was trying to do:

From (e.g.):
逢芦飴溢茨鰯淫迂厩噂餌襖迦牙廻恢晦蟹 etc

to:
逢
芦
飴
溢
茨
鰯
etc

Spreadsheet/regex questions - Thora - 2015-09-24

anotherjohn Wrote:
In B1 put: =mid(c$1,a1,1) and copy down

It worked (I just needed to use semi-colons). What a useful formula to know. Thanks!
Playing around with it a bit, it seems to mean (source cell; starting character position; number of characters).

What about converting the formula cells back to regular values? Is Special Paste the only/best way to do that? I don't want to turn off autocalculate because I will be using formulas elsewhere to compare columns.

Spreadsheet/regex questions - anotherjohn - 2015-09-24

Thora Wrote:
It worked (I just needed to use semi-colons). What a useful formula to know. Thanks!

You're welcome.

Yep, just replace the formulas with values. Unfortunately doing so is a bit cumbersome in OpenOffice iirc, which is a serious drawback given how common an operation it is, though there may be a shortcut for it.

Spreadsheet/regex questions - aldebrn - 2015-09-24

If it's not a super-huge file you want to do this in, you can just use your browser's Javascript Console (Tools -> Web Developer -> Web Console in Firefox, or in Chrome: Settings -> More Tools -> Developer Tools -> Console tab; all computer browsers have this). I just ran:

Code:
copy('逢芦飴溢茨鰯淫迂厩噂餌襖迦牙廻恢晦蟹'.split('').join('\n'))

and this was copied to the clipboard:

逢
芦
飴
溢
茨
鰯
淫
迂
厩
噂
餌
襖
迦
牙
廻
恢
晦
蟹

If you wind up needing to do this to a huge file which you can't paste into the web console, you can try venturing into the haunted pleasurelands of Node.js.

Spreadsheet/regex questions - aldebrn - 2015-09-24

Thora Wrote:
Neither of them worked in Notepad++. It would result in "XE9 [line break]X80 XA2" instead. ??

Horrific. Notepad++ seems to not know Unicode. U+80A2 in Unicode is 肢.

Spreadsheet/regex questions - Vempele - 2015-09-25

http://stackoverflow.com/questions/18411903/anyone-know-how-to-use-regex-in-notepad-to-find-arabic-characters

Quote:
This is happening because Notepad++ regex engine is PCRE which doesn't support the syntax you have provided.

Spreadsheet/regex questions - Thora - 2015-09-25

aldebrn, I've saved your instructions for future reference. Thank you.

aldebrn Wrote:
If you wind up needing to do this to a huge file which you can't paste into the web console, you can try venturing into the haunted pleasurelands of Node.js.

While this sounds somewhat intriguing, I suspect I don't have what it takes to get past the velvet rope. So I shall resign myself to my ordinary little files and try to take pleasure in my new javascript friend. ;-)
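
For anyone who does end up going the Node.js route, the console one-liner carries over with small changes. What follows is only a minimal sketch, assuming a reasonably recent Node.js or browser console; `splitKanji` and `utf8Hex` are made-up names for illustration, not anything posted in the thread. It uses Array.from rather than split('') so that characters outside the Basic Multilingual Plane are not cut in half, and it keeps only characters in the [\u4e00-\u9faf] range suggested earlier (which covers the main CJK block, not every kanji in Unicode). The second helper prints the UTF-8 bytes of 逢, which are E9 80 A2. Those are the same values that appeared in the Notepad++ result, which suggests the replacement there was applied to each byte rather than to each character.

```javascript
// Hypothetical helper (not from the thread): put each kanji on its own line.
function splitKanji(text) {
  // Array.from iterates by code point, so a character stored as a
  // surrogate pair stays in one piece instead of being split in two.
  return Array.from(text)
    // Keep only single characters inside the range suggested in the thread.
    .filter((ch) => /^[\u4e00-\u9faf]$/.test(ch))
    .join('\n');
}

// Hypothetical helper: the UTF-8 bytes of a string as uppercase hex pairs.
function utf8Hex(text) {
  // Array.from with a mapping function returns a plain array of strings.
  return Array.from(new TextEncoder().encode(text), (b) =>
    b.toString(16).toUpperCase().padStart(2, '0')
  ).join(' ');
}

console.log(splitKanji('逢芦飴溢茨鰯')); // six lines, one kanji each
console.log(utf8Hex('逢'));              // E9 80 A2
```

In a browser console the result of splitKanji(...) can still be wrapped in copy(...) as in the one-liner above; copy() is a console utility and is not available inside a Node.js script.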