toshiromiballza Wrote:
headphone_child Wrote: I'm a web developer and I'd be able to work on this if I'm available for work at the time you're ready to hire someone, but I'm on east coast USA so I'm guessing we wouldn't be able to meet in person. Let me know if you're interested anyway though -- I think it'd be a plus to have this developed by someone with some knowledge of the Japanese language.
That would indeed be a plus, as well as save me the time of having to instruct the developer on what this or that is, where this or that should be, etc. I can already imagine the endless phone conversations and additional meetings I'd have to have with the otherwise clueless developer... How much would you say you'd charge for such a project (the flat rate)? I imagine the prices are way higher in the US than where I am (Slovenia).
toshiromiballza Wrote:
ikore Wrote: but this is easily a few months of full-time work (which would make it more into the range of 3000-6000 euro).
Wow, seriously? I didn't think it costs that much, or that it takes that long... I was imagining this being done in two weeks tops... Maybe I should learn to code and make a nice living off of that, lol.
Will a website in Python cost more than one done in PHP, by the way? I read some good articles about how it's becoming more and more used for web development, whereas PHP is "getting outdated."

I haven't scoped everything out, but I agree with ikore's ballpark. 1000 might be enough to get a website without any user-customization, but it also depends on how much of the data needs to be reorganized.
As for learning coding, go for it! Just note - there's more to software development than coding.

As for PHP, it gets a bad rap, but it's been picking up steam again with the excellent Laravel framework, which some people say is the first "good" PHP framework. Of course there's been a movement from pure backend development toward more frontend work lately, so you could serve the backend as a REST API with Laravel (or anything else) and consume it with a JavaScript MV* framework such as AngularJS. That's probably how I'd do it. That way, when you're ready to open the data to the public, you already have a public, consumable REST API with little additional work needed.
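To make the "backend as a REST API" idea a bit more concrete, here's a very rough sketch. It's in Python using only the standard library rather than Laravel, purely for illustration. The `/api/entries/<id>` URL, the `ENTRIES` data, and the function names are all made up for this example and aren't taken from the actual project.

```python
import json

# Hypothetical sample data standing in for the real dictionary database.
ENTRIES = {
    1: {"kanji": "水", "meaning": "water"},
    2: {"kanji": "火", "meaning": "fire"},
}


def json_response(start_response, status, payload):
    """Send a JSON body with the given HTTP status line."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def app(environ, start_response):
    """Minimal WSGI app: GET /api/entries/<id> returns one entry as JSON."""
    if environ.get("REQUEST_METHOD", "GET") != "GET":
        return json_response(start_response, "405 Method Not Allowed",
                             {"error": "only GET is supported"})

    path = environ.get("PATH_INFO", "")
    prefix = "/api/entries/"
    if path.startswith(prefix):
        try:
            entry_id = int(path[len(prefix):])
        except ValueError:
            entry_id = None  # not a number, falls through to the 404 below
        entry = ENTRIES.get(entry_id)
        if entry is not None:
            return json_response(start_response, "200 OK", entry)

    return json_response(start_response, "404 Not Found",
                         {"error": "no such entry"})
```

To try it locally, you could hand `app` to `wsgiref.simple_server.make_server` and request `/api/entries/1` from a browser. A JavaScript frontend (or, later, the public) would consume exactly the same JSON.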
toshiromiballza Wrote:
headphone_child Wrote: And it depends on how the data is currently stored. Which DBMS? How many tables? Is it 3NF? If not, the tables could need redesigning too. And additional tables are needed for storing user settings as described above.
Initially I went with MySQL (because it was the only SQL database I was somewhat familiar with). I soon realized it's not adequate and doesn't even support many of the CJK characters my database consists of... So I googled around and came across SQLite; it was perfect and lightweight, and I think it really is the best option for the job (not sure about all the user-related stuff, though). 23 tables, but I'm sure I could merge some together (e.g. there are two tables for 'shinjitai' and 'kyuujitai', perhaps storing both that information in the 'variants' table would be a better idea, not sure if it really matters speed-wise, though?). No idea what 1NF, 2NF, 3NF is, even after I've read something about it, lol. Also, some of the entries in the tables would require additional Python/PHP magic before being output to the user. For example, here is an example of the 'particles' field: "が(59%),に(30%),を(11%)" Before this is output, Python/PHP should "explode" the entry to separate the actual particles from the brackets, percentages and commas, so that clicking on a particle would load the appropriate sentences by querying the 'sentences' table for that 'particle+word' combination.

Yeah, no need to worry about the rigorous definition of 3NF. What's important is database normalization -- I think that article is a bit easier to understand (again, that's just if you're interested; this is something a good developer should handle for you). The main motivations for this are data integrity (eliminating the possibility of data anomalies) and reducing data redundancy. 3NF actually tends to result in more tables rather than fewer. 23 sounds like a good number, but it'll probably be higher after normalization.
The particles field contains too much data for one column, which limits the utility of that data (this particular issue could be fixed by using multiple tables). This is especially a problem if you eventually want to make this data available to the public. If the database is designed to work only for your website, then it will be difficult for others to make use of the data, plus you'll have problems with flexibility if you ever want to make changes to your website. So the database needs to be designed independently of the website, because it's a separate application tier. Again, you don't have to worry about any of this though; there's nothing wrong with the data you've collected. I'm only bringing this up to show that there's more work to be done than what one might expect. Sorry about all the jargon.
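Just to illustrate what I mean by splitting the particles field into its own table, here's a minimal sketch in Python with SQLite. The `word_particles` table, its column names, the function names, and the example word are my own assumptions for illustration, not your actual schema.

```python
import re
import sqlite3

# Matches one item such as "が(59%)". The format is assumed from the
# single example given in the thread.
ITEM = re.compile(r"^(?P<particle>[^(),]+)\((?P<pct>\d+)%\)$")


def split_particles(field):
    """Split a combined field into (particle, percent) pairs."""
    pairs = []
    for item in field.split(","):
        match = ITEM.match(item.strip())
        if match is None:
            raise ValueError(f"unexpected particle entry: {item!r}")
        pairs.append((match.group("particle"), int(match.group("pct"))))
    return pairs


def build_demo_db():
    """Create an in-memory database with one row per word/particle pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE word_particles ("
        " word TEXT NOT NULL,"
        " particle TEXT NOT NULL,"
        " percent INTEGER NOT NULL,"
        " PRIMARY KEY (word, particle))"
    )
    return conn


def store_particles(conn, word, field):
    """Insert one row for each particle listed in the combined field."""
    rows = [(word, particle, pct) for particle, pct in split_particles(field)]
    conn.executemany("INSERT INTO word_particles VALUES (?, ?, ?)", rows)
    conn.commit()


def particles_for(conn, word):
    """Return the (particle, percent) rows for a word, most frequent first."""
    cursor = conn.execute(
        "SELECT particle, percent FROM word_particles"
        " WHERE word = ? ORDER BY percent DESC",
        (word,),
    )
    return cursor.fetchall()
```

With the data stored like this, the website no longer has to "explode" a string on every request, and anyone else using the data can query particles directly, for example joining on word and particle to find matching rows in the 'sentences' table.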
toshiromiballza Wrote:
headphone_child Wrote: What could be nice is an app completely separate from the website where you download all the dictionary data onto your device initially when installing the app, so that you can use the app without an internet connection (a la JED for Android). I'd probably use an app like that.
I suppose, but this would have so many features, how do you display everything on the small screens of mobile devices, or even tablets? This would require a lot of scrolling or opening different tabs just to get to the part you wanted to see... I'm not entirely sure the app idea is feasible. And remember, this in no way replaces EDICT as a vocabulary dictionary, it's got 12,000+ entries compared to EDICT's 200,000+. So sure, it's a great kanji resource with cool extra bits for all the "official" words and common jukugo, etc., but a lot of people would be disappointed after searching, for example, "こんにちは" or some obscure jukugo, and there would be no results. I mean, I could append the rest of EDICT into the database, but those entries would have no extra features, they would just be as-is. I don't think it's worth it...

Have you seen what JED for Android looks like? It does have scrolling, but it's fine. Of course yours would have much more scrolling, though. Anyway, your concerns are understandable, and I agree this should be low priority anyway.
toshiromiballza Wrote:
headphone_child Wrote: But mainly, the really risky one is "Be a voice in the process" -- you have to be very careful of causing feature creep with this, and it could easily increase the cost of development.
Hm, well, I think I covered all the possible features myself already, and there really isn't anything to add! Well, somebody at Reddit did mention it would be nice to have pinyin (and Korean) included, so I guess I'll throw that in too... But I was referring more to the design itself. I mean, it would be better for people to see and comment on the design first, so appropriate changes can be made based on user input, instead of finishing the website and then people complaining this should be changed, this is ugly... Although, if somebody has some great ideas and I can include it, why not. In any case, I think access to the beta (or just preview screenshots) seems like a valid "award."

OK, if they just suggest minor UI changes, things like different colors, fonts, or arrangements, that's generally OK (but again, it depends on who you hire; some people will charge for any change, no matter how minor). It's actual functionality that would add to the initial costs.
Edited: 2014-04-28, 5:40 pm
