[Reproducible-builds] [GSoC 2016] : Application review
Satyam Zode
satyamzode at gmail.com
Tue Mar 22 19:26:39 UTC 2016
Hi,

Jérémy Bobbio:
> Satyam Zode:
>> Based on my research so far, a brief timeline looks like this:
>> Design and Experiment:
>> 1) During the application screening:(March 26 - April 22)
>> 1.1) Acquaint myself with diffoscope and research about proposed features.
>> 1.2) Get hands-on experience with diffoscope.
>> 1.3) Set up the development environment.
>> 1.4) Track changes to the project roadmap in a publicly accessible document.
>> 1.5) Draft the project design and discuss it with the
>> community.
>> 2) Community Bonding Period: (April 23 - May 10)
>> 2.1) Interact with the community and exchange information related to
>> project design and working of diffoscope in different conditions.
>> 2.2) Finalize the design and document it in the project design wiki.
>> 2.3) Learn more about the Debian community.
>
> During that period I think it would be worthwhile to review packages
> and, if you see an easy fix for one, submit patches. That way you would
> get better insights on the various issues and diffoscope limitations.
>
Sure! That sounds good and productive. I will note this down :-)
>> Implementation:
>> Official coding period.
>> 3) Week 1 - 2 (May 27 - June 9):
>> - Work on "Allow users to ignore arbitrary differences" part.
>> - Work simultaneously on unreproducible packages.
>
> How much time are you going to give to the community so they can review
> your proposed user interfaces?
>
I think I will be ready with a design for the above by 1st May. After
that, we can discuss the user interfaces until 10th May; from 11th May
I will have exams, so I won't be available for active discussions. If
anything remains to be discussed, we can always continue alongside the
work during the coding period.
>> 4) Week 3 - 4 (June 10 - June 22):
>> - Work on Parallel processing part.
>> - Work simultaneously on unreproducible packages.
>
> This is unlikely to work. Implementing parallel processing requires
> deep focus because it's also about adding missing locks and
> understanding subtle concurrency issues.
>
> How much experience do you have with concurrent programming?
I have good experience with concurrent programming. I have written
many concurrent programs in golang and I believe it'll help me here.
> I think you underestimate how hard this is to get right. At the very
> least you should be entirely focused on this and not fixing packages at
> the same time.
I understand that this is not going to be a piece of cake for me.
However, if we remove package fixing from this part of the schedule, I
will have enough time to concentrate on this particular problem. I will
seek help from the community to clear my doubts and will share my
experiences with them. I will certainly try my best. As soon as the
application screening period ends, I will start practicing concurrent
programming in Python to gain some experience. This will eventually
help me :-). As I see it, the major hurdle is understanding the
requirements and current state of diffoscope (why do we need parallel
processing in diffoscope? etc.). I am still not completely aware of the
actual problem, but once I understand it I will start putting in the
effort to solve it.
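To make the locking concern concrete for myself, here is a minimal
sketch in Python. It is not diffoscope code: `compare_pair` and
`collect_differences` are hypothetical names, and the "comparison" is
just an equality check on bytes. It only shows worker threads guarding
a shared result list with a lock:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def compare_pair(pair):
    """Hypothetical stand-in for a comparator: True when the two sides differ."""
    left, right = pair
    return left != right


def collect_differences(pairs, max_workers=4):
    """Compare pairs in a thread pool and return the indices that differ."""
    differences = []          # shared state written by several threads
    lock = threading.Lock()   # guards every write to `differences`

    def worker(index_and_pair):
        index, pair = index_and_pair
        if compare_pair(pair):
            with lock:
                differences.append(index)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() drains the iterator so worker exceptions are re-raised here.
        list(pool.map(worker, enumerate(pairs)))

    # Threads finish in arbitrary order, so sort for a stable result.
    return sorted(differences)
```

The real work would be finding which shared state in diffoscope needs
this kind of protection, which this sketch does not attempt.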
>
>> ----------------------- Mid-Term Evaluations -----------------------
>>
>> 5) Week 5 - 7 (June 23 - July 13):
>> - Finish remaining work
>> - Start working on fuzzy matching algorithm.
>> 6) Week 8 - 10 (July 14 - August 3):
>> - Finish fuzzy matching algorithm implementation.
>> - Work on new file-format comparators.
>
> diffoscope already supports fuzzy matching via TLSH. It's implemented and
> works nicely. But it only does so inside a container. That means it will
> not notice when you compare foo.gz and foo.xz that foo might actually be
> the same file. Three weeks for that feels like too much.
>
Yeah! I know three weeks will be too much, but I think parallel
processing will need more time and will spill over past the mid-term
evaluations. The point "Finish remaining work" refers to the remaining
parallel processing work. Later, we can always use any free time to
write tests and comparators and to fix packages :-).
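As I understand the foo.gz / foo.xz case, the idea is to look at the
payload rather than the compressed bytes. Here is a rough sketch using
only the Python standard library. It is not how diffoscope does it, and
it checks for an exact match with SHA-256 rather than a fuzzy match
with TLSH; `payload_digest` and `same_payload` are hypothetical names:

```python
import gzip
import hashlib
import lzma


def payload_digest(data, kind):
    """Return the SHA-256 hex digest of the decompressed payload.

    `kind` is "gz" or "xz" and names the compression format of `data`.
    """
    if kind == "gz":
        raw = gzip.decompress(data)
    elif kind == "xz":
        raw = lzma.decompress(data)
    else:
        raise ValueError("unsupported compression kind: %r" % (kind,))
    return hashlib.sha256(raw).hexdigest()


def same_payload(left, left_kind, right, right_kind):
    """True when both compressed blobs decompress to identical bytes."""
    return payload_digest(left, left_kind) == payload_digest(right, right_kind)
```

For example, `same_payload(gzip.compress(b"foo"), "gz",
lzma.compress(b"foo"), "xz")` should be true even though the two
compressed blobs share no bytes in common.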
>> 7) Week 11 (August 4 - August 13):
>> - Write tests for implemented features and comparators.
>
> Big no here. Tests should be written prior to or during the development
> of the various features. While the code coverage has never been 100%, at
> least the basics should be covered. So please refine the timeline by
> making enough room to write tests during the development.
>
Cool! I will note this point too and will do unit testing after each
feature has been implemented.
>> - keep working on unreproducible Debian packages.
>>
>> Documentation:
>> 8) Week 12 (August 15 - August 22): Suggested pencils down date
>> - Code refactoring.
>> - Finish documentation.
>
> What kind of code refactoring are you thinking about?
>
Based on the feedback I get from the community, I will try to fix the
bugs in the code I have written.
> What kind of documentation are you thinking about? Like tests, user
> documentation should be written at the same time as, or maybe prior to,
> the actual features.
>
Agreed!
In my opinion, there must be some buffer time in the software
development process for any unexpected incident. Hence, I am planning
to keep this time as a buffer. What do you think about it? (Of course,
I will devote this time to community work only.)
>
> Sorry if this starts to feel annoying, but I'd like to avoid us making
> mistakes that I've seen several times in the past with other GSoC.
>
I feel curious whenever you ask me to do something or point out my
mistakes because I know your queries can only feed my hungry mind. I
like learning new things and I always like to experience new things.
And trust me, I am enjoying every day with the community because every
day I am learning something new :-)
Thanking you!
Satyam Zode