Course talk:CPSC445/2011W2
Contents
Thread title | Replies | Last modified |
---|---|---|
Study group | 1 | 16:58, 14 April 2012 |
A4 Q1b - not too sure what the scope is. Is this a Motif Finding problem or is this a Motif Discovery problem? | 2 | 02:46, 10 April 2012 |
Python for the assignments! | 2 | 02:03, 15 February 2012 |
A1: Question 2.1 | 1 | 02:01, 15 February 2012 |
Study group
Discussing topics together and asking each other questions would be a really good way to study for this exam, I think. If Saturday works for everyone who is interested, maybe we could meet by Reboot at 10am and study there for the day. And maybe Monday night as well, for those of us who have the exam on Tuesday. Any other suggestions are welcome. Hope to see you there.
A4 Q1b - not too sure what the scope is. Is this a Motif Finding problem or is this a Motif Discovery problem?
Hey all, I'm not too sure how to interpret Q1b. The question seems to be asking us to find all instances of 'the' motif (as in, the motif is known) in a given set of DNA sequences. That is odd: why a set of DNA sequences? A set of DNA sequences seems to imply that this is a Motif Discovery problem, but it's not a motif discovery problem if the motif is already known.
Alright, after rereading the questions, I think it could work as a motif discovery problem if we reverse the order of Q1a and Q1b. Perhaps the question can be interpreted like this: given a set of DNA sequences, identify the instances of some motif. Then, given the instances of this motif, how would you formally define it?
Since this is under the Motif Discovery section, that might be just what the question is asking.
Thoughts?
-A
Ah, no. That's probably not the right interpretation after all: they want an algorithm for each definition. Now I'm stuck. Still open to any thoughts on this, though.
Got a reply from Dr. Holger: the proper interpretation is that Q1a and Q1b are both asking about Motif Finding, not Motif Discovery.
Cheers.
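For anyone reading this thread later, here is a minimal sketch of Motif Finding in the sense clarified above: locating every instance of a known motif in a set of DNA sequences. It assumes the motif is a plain string that must match exactly, which may not be the definition Q1a has in mind. The function name `find_motif_instances` and the example sequences are made up for illustration.

```python
def find_motif_instances(motif, sequences):
    """Return {sequence index: [0-based start positions]} of exact motif matches.

    Assumption: the motif is a fixed string and only exact matches count.
    """
    if not motif:
        raise ValueError("motif must be a non-empty string")
    hits = {}
    for idx, seq in enumerate(sequences):
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            # Advance by one character so overlapping instances are reported too.
            start = seq.find(motif, start + 1)
        hits[idx] = positions
    return hits


# Hypothetical input data, not taken from the assignment.
sequences = ["ACGTATATAGC", "TTATAGG", "CCCC"]
print(find_motif_instances("TATA", sequences))  # {0: [3, 5], 1: [1], 2: []}
```

A motif defined as a consensus with mismatches or as a position weight matrix would need a different matching step, but the outer loop over the set of sequences would stay the same.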
Python for the assignments!
Time to crack open this forum!
Python is one of the most commonly used programming languages in bioinformatics (next to C++ and probably Perl or the like). It lets you write a lot in a few lines that resemble pseudocode and are easy to read.
I'm sure C++ is excellent (and probably faster), but personally I have not programmed in C++ before, and it can quickly become messy and unnecessarily time-consuming. Java, on the other hand, has a tendency to need O(n^2) more lines of code than C++ or Python and is, in my opinion, better suited to large-scale software.
Python is excellent for building prototypes that can later be optimized at computational bottlenecks, so I hereby throw in my vote for (the option of) using Python for the next assignments, along with C++ and/or Java.
Hey, I'm checking with Holger on this one. It does create more work for me since I already need to support solutions for both C++ and Java, but we'll talk about it and I'll get back to you.
A1: Question 2.1
I'm not sure what is meant when we're asked to build an FSA and recurrence relation "[...] using the above sequences, [...]". Isn't an FSA (and the recursion tables) generic, in the sense that it looks more like figure 2.9 in BSA, potentially with 'e' and 'd' replaced by the actual penalty scores given for this case? Where do the sequences A and B fit into the FSA diagram (I assume we're not talking about something like figure 2.10 in BSA in this particular subquestion)?
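One way to separate the generic part from the sequence-specific part is to write the recurrence out as code. The sketch below assumes the question refers to the three-state affine gap model (match state M, insert states Ix and Iy, gap-open penalty d, gap-extension penalty e). In it, the sequences enter only through the substitution score s(x_i, y_j) and the table dimensions. The function name, the match/mismatch scores, and the default penalties are placeholders, not the values from the assignment.

```python
import math


def affine_gap_score(x, y, match=1, mismatch=-1, d=3, e=1):
    """Best global alignment score under a three-state affine gap model.

    Assumptions (placeholders, not the assignment's values): simple
    match/mismatch scoring, gap-open penalty d, gap-extension penalty e.
    """
    neg_inf = -math.inf
    n, m = len(x), len(y)
    # One table per FSA state.
    M = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    Ix = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    Iy = [[neg_inf] * (m + 1) for _ in range(n + 1)]

    M[0][0] = 0
    for i in range(1, n + 1):
        Ix[i][0] = -d - (i - 1) * e  # x aligned to a leading gap in y
    for j in range(1, m + 1):
        Iy[0][j] = -d - (j - 1) * e  # y aligned to a leading gap in x

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # The only place the actual residues are looked at.
            s = match if x[i - 1] == y[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], Ix[i - 1][j - 1], Iy[i - 1][j - 1])
            Ix[i][j] = max(M[i - 1][j] - d, Ix[i - 1][j] - e)
            Iy[i][j] = max(M[i][j - 1] - d, Iy[i][j - 1] - e)

    return max(M[n][m], Ix[n][m], Iy[n][m])


# Hypothetical sequences, not the A and B from the assignment.
print(affine_gap_score("ACGT", "AGT"))
```

The state transitions and penalties are the same for any input. Changing the sequences changes the table contents, not the structure of the recurrence.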