kanji koohii FORUM
One of my pet peeves about the "search" box on websites...




One of my pet peeves about the "search" box on websites... - john555 - 2014-09-09

...is when you hit search, you are taken outside to Google and the search is done there. Like the website is too cheap to have their own search engine that only searches their own site.

There's a local newspaper that does this and the search results (generated by Google) are in random date order.

So if you are searching for the article they did last week about Taylor Swift, for example, it might show articles from two years ago at the top of the list.


One of my pet peeves about the "search" box on websites... - chamcham - 2014-09-09

I don't see why any newspaper would ever want to re-implement their own version of search for their site.

How many competent programmers are working for newspapers?
Could they do better than Google Search?
Can the newspaper afford a programmer with expertise in search technologies (Lucene, Solr, Elasticsearch, etc.)?

If they had to hire a programmer, they would most likely hire the cheapest one they could find.
And every newspaper site's search function would be broken in a different way.


One of my pet peeves about the "search" box on websites... - vix86 - 2014-09-09

john555 Wrote:...is when you hit search, you are taken outside to Google and the search is done there. Like the website is too cheap to have their own search engine that only searches their own site.

There's a local newspaper that does this and the search results (generated by Google) are in random date order.

So if you are searching for the article they did last week about Taylor Swift, for example, it might show articles from two years ago at the top of the list.
You drastically underestimate how hard it is to create good search algorithms. There's a reason why everyone uses Google for their indexing and search service: Google has been doing it for years now and it's what they are best at.


One of my pet peeves about the "search" box on websites... - Inny Jan - 2014-09-10

vix86 Wrote:You drastically underestimate how hard it is to create good search algorithms. There's a reason why everyone uses Google for their indexing and search service: Google has been doing it for years now and it's what they are best at.
Nah, it's eaaasy. You just need to convince whoever is running the website to get an ample number of pigeons...
http://www.google.com.au/technology/pigeonrank.html


One of my pet peeves about the "search" box on websites... - Nyanda - 2014-09-10

I agree with john555, in certain cases.

I much prefer sites that have a search of their own which can list results by some criterion specific to the site, like the date an article was published, when I don't care about searching the content of the entire site.

Yes, Google is much better when it comes to a general search of an entire site, but if you are just searching for articles, for example, it isn't difficult, in any way, shape, or form, to search a few columns in a database for some keywords.
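As a rough illustration of the kind of column search described above, here is a minimal sketch using Python's built-in sqlite3 module. The `articles` table, its `published`, `title`, and `body` columns, and the helper names `search_articles` and `demo_connection` are all made up for the example; they are not from any real newspaper site. It matches every keyword against the title or body and lists the newest articles first.

```python
import sqlite3


def search_articles(conn, keywords, limit=10):
    """Return (published, title) rows that contain every keyword, newest first."""
    clauses = []
    params = []
    for word in keywords:
        # Each keyword must appear in the title or the body.
        # Note: % and _ inside a keyword are not escaped in this sketch.
        clauses.append("(title LIKE ? OR body LIKE ?)")
        pattern = f"%{word}%"
        params.extend([pattern, pattern])
    where = " AND ".join(clauses) if clauses else "1"
    sql = (
        "SELECT published, title FROM articles "
        f"WHERE {where} ORDER BY published DESC LIMIT ?"
    )
    params.append(limit)
    return conn.execute(sql, params).fetchall()


def demo_connection():
    """Build a tiny in-memory database with hypothetical sample articles."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (published TEXT, title TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO articles VALUES (?, ?, ?)",
        [
            ("2012-08-01", "Taylor Swift plays county fair", "Concert review."),
            ("2014-09-02", "Taylor Swift announces tour", "Tickets go on sale soon."),
            ("2014-09-05", "City council meets", "Budget vote postponed."),
        ],
    )
    return conn


if __name__ == "__main__":
    for row in search_articles(demo_connection(), ["taylor", "swift"]):
        print(row)
```

Sorting by the `published` column is what gives the "last week's article first" behaviour that the original complaint was about. A plain LIKE scan like this does no ranking by relevance at all.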

It's even easy enough to implement a simple algorithm to match words that might be misspelled, and SQL databases usually even come with a function to search for words that sound like other words out of the box.
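A minimal sketch of the misspelling idea, assuming the site keeps a list of known keywords. Instead of a phonetic SQL function such as SOUNDEX, this swaps in a plain string-similarity ratio from Python's standard difflib module. The `correct_keyword` helper, the vocabulary, and the 0.8 cutoff are illustrative choices only.

```python
import difflib


def correct_keyword(word, vocabulary, cutoff=0.8):
    """Return the closest known word, or the lowercased input if nothing is close."""
    word = word.lower()
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


if __name__ == "__main__":
    # Hypothetical vocabulary, e.g. collected from article titles.
    vocabulary = ["taylor", "swift", "council", "budget"]
    print(correct_keyword("Tayler", vocabulary))
```

The corrected keyword can then be fed into whatever database query the site already runs.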

It just seems lazy to me when sites use Google for a search box that users aren't expecting to search the entire site.

After all, you wouldn't use Google to create a specific search for flats for rent, or job listings, would you?


One of my pet peeves about the "search" box on websites... - vix86 - 2014-09-10

Nyanda Wrote:After all, you wouldn't use Google to create a specific search for flats for rent, or job listings, would you?
No, but that's because that kind of search is very specific. Google excels at actual content searches. Google's algorithm doesn't just search for what you put in the search box; it's also smart enough to expand search terms to synonymous terms. If you google "querying a db", it expands db to "database." If you search for something like "fix a phone", it also searches for "repairing a phone." Google also has a lot of backend stuff, like caching algorithms, so that highly likely search requests are served up immediately instead of having to perform a full search of the site/database. Google's algorithms are smart enough to treat a page as relevant when all the terms in a search match, instead of returning every page that happens to contain any occurrence of your search terms. And it has the server farms to search the entire site's available content for all the search terms AND some subset of the search terms AND some synonymous set of the search terms.

Seriously, imagine you are a news site that hosts over a million news articles with an average of 5k-10k words in an article and you want to do the same kind of search that I just mentioned and STILL be able to return the most relevant results to a user in less than a second. It'd be ridiculous to roll your own search when Google has been building an infrastructure to do it for the past decade.
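To make the synonym expansion and caching points concrete, here is a toy sketch. It is in no way how Google does it: the `SYNONYMS` table, the `DOCUMENTS` dictionary, and the `expand` and `search` helpers are all invented for illustration, and the cache is simply Python's functools.lru_cache.

```python
from functools import lru_cache

# Hypothetical hand-written synonym table; a real engine would derive these from data.
SYNONYMS = {
    "db": {"database"},
    "fix": {"repair", "repairing"},
}

# Hypothetical document store: id -> text.
DOCUMENTS = {
    1: "querying a database from python",
    2: "repairing a phone screen",
    3: "local election results",
}


def expand(term):
    """Return the term together with any synonyms listed for it."""
    return {term} | SYNONYMS.get(term, set())


@lru_cache(maxsize=1024)
def search(query):
    """Return ids of documents where every query term, or a synonym of it, appears."""
    groups = [expand(term) for term in query.lower().split()]
    hits = []
    for doc_id, text in DOCUMENTS.items():
        words = set(text.split())
        # Every term group needs at least one word present in the document.
        if all(group & words for group in groups):
            hits.append(doc_id)
    return tuple(hits)


if __name__ == "__main__":
    print(search("querying a db"))
    print(search("fix a phone"))
    print(search("fix a phone"))   # repeated query is answered from the cache
    print(search.cache_info())
```

Even this toy version scans every document on a cache miss, which is the part that gets expensive at the million-article scale described above.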

Even small websites with a few thousand pages would still benefit from using Google because it means less CPU time running database searches.


One of my pet peeves about the "search" box on websites... - Nyanda - 2014-09-10

So it seems that we actually agree.

For specific searches, rolling your own search makes more sense, and for general site-wide searches, Google makes more sense.

And even when the volume of data is large, if the users wouldn't benefit from a Google search then a specific search still makes more sense in the end, even if using Google would mean less CPU time running searches on your own database.


One of my pet peeves about the "search" box on websites... - Stansfield123 - 2014-09-10

It's a question of a cost/benefit evaluation, not a question of laziness or what a perfect website would do to service every need its users may have.

A newspaper site's function is to provide its users with up-to-date news, not a news archive. Sure, it would be great if it also functioned as an archive, but why would a local newspaper, which, odds are, barely makes enough money to pay its staff, spend its resources on that?

Prioritizing resource allocation isn't "laziness". Hiring an SQL expert to build a custom search function for a local newspaper would be a poor allocation of resources.


One of my pet peeves about the "search" box on websites... - vix86 - 2014-09-10

Quote:means less CPU time running database searches
Maybe I should have clarified this a little more just in case people aren't familiar with hosting servers.

When a web hosting company needs to decide what to charge you for running your website, they can look at a number of metrics, but the biggest two are space (how much data you are storing) and CPU time (how long your site tied up the server's CPUs). Serving a page is usually a quick read on the hard drive or an equally quick lookup in a DB (note: potentially on the scale of microseconds). But a search looking for specific terms can take multiple seconds on a good SQL query (maybe minutes if it's bad). So the option to just embed the Google search engine in your site isn't just a decision on laziness, it's a monetary decision too.

(Server space, i.e. how many GBs you use, has gotten dirt cheap. We're talking a fraction of a cent per GB (e.g. $0.001/GB). Server time, however, still remains pricey, sometimes $0.01/hour.)

EDIT: Here's an example of what Google charges for web apps, which can be web sites. https://developers.google.com/appengine/pricing

Instance Hours = CPU time doing stuff. $0.05/hour
Data storage costs more on app engine. $0.18 per GB per Month.
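
As a back-of-the-envelope sketch of the arithmetic, the two prices quoted above can be plugged into a small estimate. The workload numbers (searches per month, seconds per search, GB stored) are made-up assumptions, and the sketch treats instance hours as pure CPU time, as the post does, which real billing may not match.

```python
INSTANCE_HOUR_USD = 0.05       # instance-hour price quoted in the post
STORAGE_GB_MONTH_USD = 0.18    # storage price per GB per month quoted in the post


def monthly_cost(searches_per_month, seconds_per_search, stored_gb):
    """Estimate monthly cost in USD: CPU hours spent on searches plus storage."""
    cpu_hours = searches_per_month * seconds_per_search / 3600
    return cpu_hours * INSTANCE_HOUR_USD + stored_gb * STORAGE_GB_MONTH_USD


if __name__ == "__main__":
    # Hypothetical workload: 100,000 searches a month at 2 seconds each, 10 GB stored.
    print(f"${monthly_cost(100_000, 2, 10):.2f} per month")
```

Changing `seconds_per_search` shows how quickly a slow query comes to dominate the bill compared with storage.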


One of my pet peeves about the "search" box on websites... - yogert909 - 2014-09-10

And there's something to be said for everyone doing things the same way instead of every webpage I go to having a different search with its own special features. Using your example, if I wanted to find a recent article about Taylor Swift using Google, I know exactly how to limit the search to articles from the last week, month, or any other advanced search, without having to understand any idiosyncrasies of the developer's whim.