Description
Life scientists today urgently need training in bioinformatics skills. Too many bioinformatics programs are poorly written and barely maintained, usually by students and researchers who've never learned basic programming skills. This practical guide shows postdoc bioinformatics professionals and students how to exploit the best parts of Python to solve problems in biology while creating documented, tested, reproducible software.

Ken Youens-Clark, author of Tiny Python Projects (Manning), demonstrates not only how to write effective Python code but also how to use tests to write and refactor scientific programs. You'll learn the latest Python features and tools (including linters, formatters, type checkers, and tests) to create documented and tested programs. You'll also tackle 14 challenges in Rosalind, a problem-solving platform for learning bioinformatics and programming.

Create command-line Python programs to document and validate parameters
Write tests to verify and refactor programs and confirm they're correct
Address bioinformatics ideas using Python data structures and modules such as Biopython
Create reproducible shortcuts and workflows using makefiles
Parse essential bioinformatics file formats such as FASTA and FASTQ
Find patterns of text using regular expressions
Use higher-order functions in Python like filter(), map(), and reduce()

Table of contents:

Preface
    Who Should Read This?
    Programming Style: Why I Avoid OOP and Exceptions
    Structure
    Test-Driven Development
    Using the Command Line and Installing Python
    Getting the Code and Tests
    Installing Modules
    Installing the new.py Program
    Why Did I Write This Book?
    Conventions Used in This Book
    Using Code Examples
    O'Reilly Online Learning
    How to Contact Us
    Acknowledgments

I. The Rosalind.info Challenges

1. Tetranucleotide Frequency: Counting Things
    Getting Started
    Creating the Program Using new.py
    Using argparse
    Tools for Finding Errors in the Code
    Introducing Named Tuples
    Adding Types to Named Tuples
    Representing the Arguments with a NamedTuple
    Reading Input from the Command Line or a File
    Testing Your Program
    Running the Program to Test the Output
    Solution 1: Iterating and Counting the Characters in a String
    Counting the Nucleotides
    Writing and Verifying a Solution
    Additional Solutions
    Solution 2: Creating a count() Function and Adding a Unit Test
    Solution 3: Using str.count()
    Solution 4: Using a Dictionary to Count All the Characters
    Solution 5: Counting Only the Desired Bases
    Solution 6: Using collections.defaultdict()
    Solution 7: Using collections.Counter()
    Going Further
    Review

2. Transcribing DNA into mRNA: Mutating Strings, Reading and Writing Files
    Getting Started
    Defining the Program's Parameters
    Defining an Optional Parameter
    Defining One or More Required Positional Parameters
    Using nargs to Define the Number of Arguments
    Using argparse.FileType() to Validate File Arguments
    Defining the Args Class
    Outlining the Program Using Pseudocode
    Iterating the Input Files
    Creating the Output Filenames
    Opening the Output Files
    Writing the Output Sequences
    Printing the Status Report
    Using the Test Suite
    Solutions
    Solution 1: Using str.replace()
    Solution 2: Using re.sub()
    Benchmarking
    Going Further
    Review

3. Reverse Complement of DNA: String Manipulation
    Getting Started
    Iterating Over a Reversed String
    Creating a Decision Tree
    Refactoring
    Solutions
    Solution 1: Using a for Loop and Decision Tree
    Solution 2: Using a Dictionary Lookup
    Solution 3: Using a List Comprehension
    Solution 4: Using str.translate()
    Solution 5: Using Bio.Seq
    Review

4. Creating the Fibonacci Sequence: Writing, Testing, and Benchmarking Algorithms
    Getting Started
    An Imperative Approach
    Solutions
    Solution 1: An Imperative Solution Using a List as a Stack
    Solution 2: Creating a Generator Function
    Solution 3: Using Recursion and Memoization
    Benchmarking the Solutions
    Testing the Good, the Bad, and the Ugly
    Running the Test Suite on All the Solutions
    Going Further
    Review

5. Computing GC Content: Parsing FASTA and Analyzing Sequences
    Getting Started
    Get Parsing FASTA Using Biopython
    Iterating the Sequences Using a for Loop
    Solutions
    Solution 1: Using a List
    Solution 2: Type Annotations and Unit Tests
    Solution 3: Keeping a Running Max Variable
    Solution 4: Using a List Comprehension with a Guard
    Solution 5: Using the filter() Function
    Solution 6: Using the map() Function and Summing Booleans
    Solution 7: Using Regular Expressions to Find Patterns
    Solution 8: A More Complex find_gc() Function
    Benchmarking
    Going Further
    Review

6. Finding the Hamming Distance: Counting Point Mutations
    Getting Started
    Iterating the Characters of Two Strings
    Solutions
    Solution 1: Iterating and Counting
    Solution 2: Creating a Unit Test
    Solution 3: Using the zip() Function
    Solution 4: Using the zip_longest() Function
    Solution 5: Using a List Comprehension
    Solution 6: Using the filter() Function
    Solution 7: Using the map() Function with zip_longest()
    Solution 8: Using the starmap() and operator.ne() Functions
    Going Further
    Review

7. Translating mRNA into Protein: More Functional Programming
    Getting Started
    K-mers and Codons
    Translating Codons
    Solutions
    Solution 1: Using a for Loop
    Solution 2: Adding Unit Tests
    Solution 3: Another Function and a List Comprehension
    Solution 4: Functional Programming with the map(), partial(), and takewhile() Functions
    Solution 5: Using Bio.Seq.translate()
    Benchmarking
    Going Further
    Review

8. Find a Motif in DNA: Exploring Sequence Similarity
    Getting Started
    Finding Subsequences
    Solutions
    Solution 1: Using the str.find() Method
    Solution 2: Using the str.index() Method
    Solution 3: A Purely Functional Approach
    Solution 4: Using K-mers
    Solution 5: Finding Overlapping Patterns Using Regular Expressions
    Benchmarking
    Going Further
    Review

9. Overlap Graphs: Sequence Assembly Using Shared K-mers
    Getting Started
    Managing Runtime Messages with STDOUT, STDERR, and Logging
    Finding Overlaps
    Grouping Sequences by the Overlap
    Solutions
    Solution 1: Using Set Intersections to Find Overlaps
    Solution 2: Using a Graph to Find All Paths
    Going Further
    Review

10. Finding the Longest Shared Subsequence: Finding K-mers, Writing Functions, and Using Binary Search
    Getting Started
    Finding the Shortest Sequence in a FASTA File
    Extracting K-mers from a Sequence
    Solutions
    Solution 1: Counting Frequencies of K-mers
    Solution 2: Speeding Things Up with a Binary Search
    Going Further
    Review

11. Finding a Protein Motif: Fetching Data and Using Regular Expressions
    Getting Started
    Downloading Sequences Files on the Command Line
    Downloading Sequences Files with Python
    Writing a Regular Expression to Find the Motif
    Solutions
    Solution 1: Using a Regular Expression
    Solution 2: Writing a Manual Solution
    Going Further
    Review

12. Inferring mRNA from Protein: Products and Reductions of Lists
    Getting Started
    Creating the Product of Lists
    Avoiding Overflow with Modular Multiplication
    Solutions
    Solution 1: Using a Dictionary for the RNA Codon Table
    Solution 2: Turn the Beat Around
    Solution 3: Encoding the Minimal Information
    Going Further
    Review

13. Location Restriction Sites: Using, Testing, and Sharing Code
    Getting Started
    Finding All Subsequences Using K-mers
    Finding All Reverse Complements
    Putting It All Together
    Solutions
    Solution 1: Using the zip() and enumerate() Functions
    Solution 2: Using the operator.eq() Function
    Solution 3: Writing a revp() Function
    Testing the Program
    Going Further
    Review

14. Finding Open Reading Frames
    Getting Started
    Translating Proteins Inside Each Frame
    Finding the ORFs in a Protein Sequence
    Solutions
    Solution 1: Using the str.index() Function
    Solution 2: Using the str.partition() Function
    Solution 3: Using a Regular Expression
    Going Further
    Review

II. Other Programs

15. Seqmagique: Creating and Formatting Reports
    Using Seqmagick to Analyze Sequence Files
    Checking Files Using MD5 Hashes
    Getting Started
    Formatting Text Tables Using tabulate()
    Solutions
    Solution 1: Formatting with tabulate()
    Solution 2: Formatting with rich
    Going Further
    Review

16. FASTX grep: Creating a Utility Program to Select Sequences
    Finding Lines in a File Using grep
    The Structure of a FASTQ Record
    Getting Started
    Guessing the File Format
    Solution
    Going Further
    Review

17. DNA Synthesizer: Creating Synthetic Data with Markov Chains
    Understanding Markov Chains
    Getting Started
    Understanding Random Seeds
    Reading the Training Files
    Generating the Sequences
    Structuring the Program
    Solution
    Going Further
    Review

18. FASTX Sampler: Randomly Subsampling Sequence Files
    Getting Started
    Reviewing the Program Parameters
    Defining the Parameters
    Nondeterministic Sampling
    Structuring the Program
    Solutions
    Solution 1: Reading Regular Files
    Solution 2: Reading a Large Number of Compressed Files
    Going Further
    Review

19. Blastomatic: Parsing Delimited Text Files
    Introduction to BLAST
    Using csvkit and csvchk
    Getting Started
    Defining the Arguments
    Parsing Delimited Text Files Using the csv Module
    Parsing Delimited Text Files Using the pandas Module
    Solutions
    Solution 1: Manually Joining the Tables Using Dictionaries
    Solution 2: Writing the Output File with csv.DictWriter()
    Solution 3: Reading and Writing Files Using pandas
    Solution 4: Joining Files Using pandas
    Going Further
    Review

A. Documenting Commands and Creating Workflows with make
    Makefiles Are Recipes
    Running a Specific Target
    Running with No Target
    Makefiles Create DAGs
    Using make to Compile a C Program
    Using make for a Shortcut
    Defining Variables
    Writing a Workflow
    Other Workflow Managers
    Further Reading

B. Understanding $PATH and Installing Command-Line Programs

Epilogue

Index