Wow, this one was pretty tough! I had just finished my MSc and started my first job as a data scientist, and I thought that teaching others would be much simpler than being taught. That was the biggest mistake I’ve ever made. The amount of work needed to prepare lectures and homework was a real challenge for me: I spent almost every weekend of spring 2018 preparing lecture notes and reviewing homework.

Here are a few lessons I learned from this course:

  1. I selected C# and .NET as the environment for course assignments, but my students were barely able to comprehend it. I agree that they should learn it even if it is more complex than Python, but I am also already bored with it after 5 years of .NET development. I guess I will use Python or Go for next year’s assignments.
  2. Lectures are much more interactive when you mix them with ad hoc practical assignments. For the lecture on aggregation, I prepared a Kaggle dataset, and we had a lot of fun querying it for different data slices.
  3. I prepared a functional test to check students’ solutions automatically, yet I still needed to clone their code and start the test manually. That was a lot of overhead for a few simple actions. I will use Travis or something like it to automate this process next year.
    • Also, I made the test available to students, and they really overused it. Many of them didn’t even bother to test their solutions manually.
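
To give a flavor of the "data slices" from lesson 2, here is a minimal sketch of an aggregation query with GROUP BY and HAVING. It uses Python with an in-memory SQLite database purely for illustration. The `sales` table, its columns, its rows, and the `revenue_by_region` helper are all made up; they are not the Kaggle dataset or the C# code used in class.

```python
import sqlite3


def revenue_by_region(conn, min_total):
    """Return (region, order_count, total) for regions whose total exceeds min_total."""
    query = """
        SELECT region, COUNT(*) AS orders, SUM(amount) AS total
        FROM sales
        GROUP BY region
        HAVING SUM(amount) > ?
        ORDER BY total DESC
    """
    return conn.execute(query, (min_total,)).fetchall()


# Hypothetical toy data standing in for a real dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 100), ("north", 250), ("south", 80), ("east", 300), ("east", 40)],
)

rows = revenue_by_region(conn, 100)
for region, orders, total in rows:
    print(region, orders, total)
```

WHERE filters individual rows before grouping, while HAVING filters whole groups after aggregation, which is why the threshold on `SUM(amount)` has to go into HAVING here.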

Lecture notes

  1. Course overview

  2. DBMS 1
    • DBMS vs filesystem
    • Data abstraction
    • Physical independence
    • Relational data model
  3. Query 1
    • Data manipulation languages
    • SQL: SELECT, FROM, AS, WHERE, ORDER BY, LIMIT
    • Relational algebra: selection, projection
  4. Modeling 1
    • Data definition language
    • Conceptual vs logical vs physical data models
    • SQL data types
    • Constraints
    • Keys
    • SQL: CREATE
  5. Writing 1
    • SQL: INSERT, UPDATE, DELETE
  6. Application 1
    • System.Data: IDbConnection, IDbCommand, IDataReader, IDbDataAdapter
    • ADO.NET
  7. Not Only SQL 1
    • Why NoSQL: impedance mismatch, speed, big data
    • Why SQL: single language, static typing, integrity control
  8. DBMS 2
    • Backups
  9. Modeling 2
    • Associations
      • One-to-many
      • Many-to-many
      • One-to-one
    • Foreign key
  10. Query 2
    • Cartesian product
    • Join: inner, left, right, outer
    • Views
  11. Application 2
    • ORM
    • Dapper.NET
    • Linq2SQL
    • EntityFramework
  12. DBMS 3
    • Migrations
  13. Query 3
    • Aggregation functions
    • GROUP BY
    • HAVING
  14. Modeling 3
    • Normalization
    • First normal form
    • Second normal form
    • Third normal form
    • Normalized vs de-normalized
  15. Writing 3
    • Transactions
    • ACID
  16. Application 3
    • SOLID
    • Layered architecture
    • Dependency injection
    • IRepository
  17. DBMS 4
    • Storage level
    • DBMS file structure
    • Database buffer
    • Journaling
    • Database engines
  18. Modeling 4
    • Query optimization
    • Data structures overview
    • Indexes
    • Full-text search
    • JOIN strategies
  19. Not Only SQL 4
    • NoSQL categories
    • Key-value storage
    • Wide column storage
    • Document storage
    • Graph storage