Nilesh Chakraborty
2013-05-09 18:05:52 UTC
Hi everyone,
I have submitted my GSoC 2013 proposal for Creative Commons, to develop a
media fingerprinting library and content-based media retrieval/matching
system. I have discussed the details with Dan, and have a pretty good
understanding of what I need to do.
I will implement them in either Java or Python; OpenCV has bindings for
both. Java would be a good choice, since it is a bit faster than Python
when used with Apache Hadoop. On the other hand, Python would make for
faster development, and a few LSH/MinHash implementations already exist,
such as https://github.com/embr/lsh and
https://github.com/go2starr/lshhdc. I can adapt them for my own use or
improve them as required. If I go with Java, I would need to write them
from scratch.
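For reference, here is a minimal sketch of the MinHash idea behind those
libraries, in plain Python. It does not use the API of either linked
repository; the function names, the blake2b-based hash family, and the
default sizes are all assumptions made for illustration. It treats each
media item as a set of string tokens (for example, quantized feature
descriptors), which is also an assumption.

```python
import hashlib


def _hash(token, seed):
    # Hypothetical hash family: blake2b over "seed:token", read as a 64-bit int.
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, num_hashes=64):
    # For each seeded hash function, keep the minimum hash over the token set.
    token_set = set(tokens)
    if not token_set:
        raise ValueError("cannot fingerprint an empty token set")
    return [min(_hash(t, seed) for t in token_set) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    # The fraction of matching positions estimates the Jaccard similarity.
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def lsh_bands(signature, bands=16):
    # Split the signature into equal bands; items that share any
    # (band index, band values) key become candidate matches.
    if len(signature) % bands != 0:
        raise ValueError("signature length must be divisible by bands")
    rows = len(signature) // bands
    return [(b, tuple(signature[b * rows:(b + 1) * rows])) for b in range(bands)]
```

Two items would be compared with estimate_jaccard(minhash_signature(a),
minhash_signature(b)), and the keys from lsh_bands could serve as
map-reduce keys for grouping candidate matches.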
If I go with Python, my choice of map-reduce framework would be
Disco<http://discoproject.org/> (a very complete map-reduce implementation
with a distributed file system like Hadoop's HDFS) instead of Apache
Hadoop. Disco is much lighter than Hadoop and easier to set up.
Please let me know what you think. Once we decide on what to use, I will
begin prototyping.
Cheers,
Nilesh
--
A quest eternal, a life so small! So don't just play the guitar, build one.
You can also email me at contact at nileshc.com or visit my
website<http://www.nileshc.com/>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.ibiblio.org/pipermail/cc-devel/attachments/20130509/f0caca51/attachment.html