
622 points ColinWright | 3 comments
graycat ◴[] No.30079869[source]
Good to hear -- my startup wants to be one more example.

Irony: The purpose of my startup is to help people find such Web sites, especially little or focused ones, that they will like. Or, the short description: my startup is to help people find Internet content they will like via, roughly, search, discovery, and recommendation.

A novel part is that the site gets some new data via a likely so far unique iterative, interactive dialog, specific to each use, and then manipulates that data with some math I derived, likely new. A classic, but mostly neglected, advanced pure math result says that, in principle, the iterations should converge to what can be regarded as the right results!

For more, no key words are involved. One can argue that the site gives the user the content with the meaning they will like best, i.e., makes some progress on working with meaning. So, the challenge is to respond to the explosion of Internet content, now often from specialized, focused, small Web sites.

For one more -- there are no user IDs, logins, or use of HTTP cookies. In particular, two users who execute the same dialog on the same day (i.e., before I add to the database) get the same results. I.e., for the users there is some relatively good privacy.
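The "same dialog, same day, same results" property can be sketched as a pure function. This is only an illustration under stated assumptions: the names `results_for` and `snapshot`, the example URLs, and the label-overlap scoring are all hypothetical stand-ins, since the actual math and data model are not described here. The point is just that no user ID, cookie, or IP address appears among the inputs.

```python
from typing import Dict, FrozenSet, List, Sequence

# Hypothetical snapshot of the database for one day: URL -> descriptive labels.
Snapshot = Dict[str, FrozenSet[str]]


def results_for(dialog: Sequence[str], snapshot: Snapshot, top_n: int = 3) -> List[str]:
    """Pure function of (dialog, snapshot); nothing about the user is an input."""
    chosen = frozenset(dialog)
    # Stand-in scoring (label overlap). The real math is NOT described in the thread.
    scored = [(len(chosen & labels), url) for url, labels in snapshot.items()]
    # Sort by score descending, then by URL, so ties break deterministically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for score, url in scored[:top_n] if score > 0]


snapshot = {
    "https://example.org/prints": frozenset({"art", "print", "sofa"}),
    "https://example.org/popcorn": frozenset({"recipe", "cheese", "popcorn"}),
    "https://example.org/jazz": frozenset({"music", "dinner"}),
}

# Two different visitors giving the same dialog against the same snapshot.
first_visitor = results_for(["art", "sofa"], snapshot)
second_visitor = results_for(["art", "sofa"], snapshot)
```

Because the function reads only the dialog and the day's snapshot, the two calls necessarily return the same list.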

Here I'm just taking the opportunity of this OP to describe my solution.

I wrote the Web site code, in .NET, and as far as I can tell the code is ready for significant production. As in the OP and some of the comments already in this thread, my plan is, indeed, to run my own server, using Windows Server, SQL Server, and, for hardware, an AMD FX-8350 at 4.0 GHz. I'm adding data, am not live yet, and have not settled on a domain name yet.

What do you think?

replies(1): >>30079939 #
marginalia_nu ◴[] No.30079939[source]
That's an interesting take.

Just curious how you plan to make money off this. I think profitability is what has killed most recommendation ventures. It's easy enough to find something that's fun (like StumbleUpon). But they usually eventually self-cannibalize by inserting ads in the recommendations.

On the flip side, if you attempt to do some sort of gated boutique thing where you need a subscription to see the results, you may encounter resistance when you attempt to discover recommendations (assuming you're crawling by yourself, which you'd almost have to I think).

Don't mean to be a downer, but profitability is arguably the hard problem to solve in this space.

replies(1): >>30080533 #
graycat ◴[] No.30080533[source]
Thanks!

Yes, I saw StumbleUpon early on. I just intend and hope to do better pleasing the users.

So, it's recommendation -- maybe the user has heard of the content but wants the quality of the curated results.

Uh, now the Internet has a lot of specialized content (e.g., as in this OP) and, presto, bingo, that means that there are also some specialized AUDIENCES, e.g., audiences the mainstream media (MSM), back to TV, radio, and all the larger newspapers worked hard to ignore -- instead, they went for the mass audience. One general result is, even now, floods of URLs where less than 1% are of interest to any one person. Early on a solution is to pick an audience (of course, one with good demographics), and more generally just to have better means of helping, pleasing users.

It's discovery -- the user has never heard of the content but the dialog is good evidence that they will like the results.

It's search -- the user knows about the content, it's likely famous, and they will recognize it once they find and see it, but they don't know where to find it, and it's not easy to find keywords that (accurately) characterize it. Can get some examples out of the fine arts.

The "interactive, iterative dialog", particular to each use, is supposed to be a more powerful, effective way to find, get to, have URLs for content the user will like. Here "like" is intended to be, basically, short for "like the meaning of the content". The user may like the content for entertainment, information, curiosity, etc.

It appears that now both Google and YouTube have seen the basic problem and have their versions of solutions. I'm hoping that my techniques do better where their solutions work poorly or not at all. I'm not trying to replace or compete with where Google, YouTube, Bing, ..., StumbleUpon work well. I'm a sole, solo founder and don't have to be worth billions to be successful.

Yes, my plans are for my site to be ad supported. But I strongly intend to follow, say, the old newspaper standard of a wall between (a) the URL results for the users and (b) the ads. E.g., as in this thread of "old Internet", at least early on, the ad targeting is supposed to be maybe from only broadly the demographics or some such of my intended audience, essentially independent of the user or the data from their dialog, or hardly targeted at all.

Right, better ad targeting could yield more revenue, and maybe the dialog data could permit some especially good ad targeting, but NO WAY do I want to have even a hint of giving users URLs that help advertisers. Some such used to be called payola and was made actually illegal.

Or, so far, the code I've written to find the URLs to report to the users has nothing about ads.

Maybe the long term situation would be that for some use instances the dialog data and the reported content URLs would do well at suggesting what ads would be especially effective but no way would ads influence what URLs are reported. Or, the connection between URLs and ads is a one way street: URLs can be used to pick ads, but ads can never pick URLs.
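The one way street can be sketched in code. This is a minimal illustration, not the site's actual code: the function names `pick_urls` and `pick_ads`, the example URLs, and the simple label matching are all hypothetical. What matters is the shape of the signatures: the URL picker has no ad-related parameter at all, while the ad picker takes the already-chosen URLs as input.

```python
from typing import Dict, List


def pick_urls(dialog: List[str], index: Dict[str, List[str]]) -> List[str]:
    """Choose content URLs from the dialog alone; there is no ad parameter."""
    hits = [url for url, labels in index.items() if any(w in labels for w in dialog)]
    return sorted(hits)


def pick_ads(urls: List[str], ads_by_topic: Dict[str, str], topic_of: Dict[str, str]) -> List[str]:
    """Choose ads afterward, reading the already-fixed URL list."""
    ads: List[str] = []
    for url in urls:
        ad = ads_by_topic.get(topic_of.get(url, ""))
        if ad and ad not in ads:
            ads.append(ad)
    return ads


# Hypothetical data for illustration only.
index = {
    "https://example.org/prints": ["art", "print"],
    "https://example.org/nfl": ["football"],
}
topic_of = {
    "https://example.org/prints": "art",
    "https://example.org/nfl": "sports",
}
ads_by_topic = {"art": "framing-shop ad", "sports": "jersey ad"}

urls = pick_urls(["art"], index)                   # fixed before any ad is considered
ads = pick_ads(urls, ads_by_topic, topic_of)       # ads depend on URLs, never the reverse
```

Since `pick_urls` never sees the ad inventory, changing or emptying `ads_by_topic` cannot change which URLs are reported.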

replies(1): >>30080994 #
hellschreiber ◴[] No.30080994[source]
It is very interesting that you describe a system which points people to web content they would like and yet you do not intend to collect data on these visitors. Or did I misunderstand? Wouldn't you need to "get to know" your visitors / customers / users before you are able to show them sites they will like?

Also, I applaud you on the way you plan on serving ads. I do know that there are mechanisms by which one can serve relevant ads, with a high likelihood of these being useful and hence clicked, without the need to build user profiles.

replies(1): >>30081538 #
1. graycat ◴[] No.30081538{3}[source]
Thanks!

Yup, you bring up some good points.

In the usual senses, approaches of getting to know users, essentially I don't want to do that: (a) Users can regard uses of such data as a threat to, compromise of, and an invasion of privacy -- an important term but ill defined and with possible negative emotional reactions. (b) E.g., what a user did for the past month going deep into the NFL does not mean that they want the NFL for this visit. Instead maybe they want the NBA, a recipe for cheese popcorn, background music for a dinner party with her boyfriend, or an art print to hang over the sofa. Uh, what key words to type in to get an art print they will like -- hmm ....

Assuming that their purpose of this visit is much like that of earlier visits can irritate users. (c) Using information on users personally can require cookies, a user ID, their IP address, fingerprinting, etc., all of which can irritate users, i.e., a bad taste in their mouth, a bad user experience. Quite broadly, one way to anger people, to insult them, is to let them know you believe that you have categorized them, that they are so simple that you can pigeonhole them. So, I want no hint of any such thing.

Ah, besides, maybe they are shopping for an art print to give to their daughter for her to hang over the sofa in her new house -- so the visit isn't even about the visitor but about the visitor's daughter! Fine with me!

So, instead of using general, old information about users, the interactive iterative dialog is to get data, some new data, I can manipulate to please the user for their interest that brought them to my site THIS time. The phrasing "Users who give the same dialog on the same day get the same results" is partly to have the user execute the dialog and still be comfortable about privacy.

In simple terms, my current guess is to have a narrow audience, e.g., maybe the BMW set. Then, sure, just from that, Cadillac, Tesla, Toll Brothers, Williams Sonoma, cruise lines, etc. may discover that their ads on my site are a bargain, have a quite good click thru rate. So, that would be targeting just via broad demographics.

replies(1): >>30082000 #
2. hellschreiber ◴[] No.30082000[source]
This sounds extremely interesting and something I personally would love to try! Keep us posted on the progress of this project!
replies(1): >>30082448 #
3. graycat ◴[] No.30082448[source]
Thanks!

I've always intended to announce going live at Hacker News and still plan to. But this OP and thread raise issues so close to my project that here I took the opportunity to outline my work and to ask for reactions. These days I'm gathering initial data, but there is some time to adjust some of the work. Maybe even more important, from the reactions here I can try to refine how I present this project to the public, e.g., so that the public or at least the intended users can understand how this project can help them and so that I can better understand this project as others will see it.

This project has been delayed by unpredictable external events independent of the project -- the project work has been fast, fun, and easy while handling the external events has been a pain from the head down to the feet.

HN is likely the world's best concentration of people for making perceptive reactions.