LINQ Fundamentals

Introduction

Language Integrated Query (LINQ) provides a consistent query experience across different data sources: collections, databases, XML, and more. LINQ brings query capabilities directly into C# with compile-time checking and IntelliSense support.


Query Syntax vs Method Syntax

LINQ offers two syntaxes. The compiler translates query syntax into the corresponding method calls, so both compile to equivalent IL code:

Query Syntax (SQL-like)

var result = from student in students
             where student.Age > 18
             orderby student.Name
             select student.Name;

Method Syntax (Fluent/Lambda)

var result = students
    .Where(s => s.Age > 18)
    .OrderBy(s => s.Name)
    .Select(s => s.Name);
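
To check that the two forms really are interchangeable, the sketch below runs both against the same data and compares the results. The `students` array and its anonymous-type shape (`Name`, `Age`) are invented for illustration, since the article does not define them.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; not part of the original article.
var students = new[]
{
    new { Name = "Zoe", Age = 22 },
    new { Name = "Adam", Age = 17 },
    new { Name = "Mia", Age = 19 },
};

// Query syntax: the compiler rewrites this into the method calls below.
var viaQuery = from student in students
               where student.Age > 18
               orderby student.Name
               select student.Name;

// Method syntax.
var viaMethods = students
    .Where(s => s.Age > 18)
    .OrderBy(s => s.Name)
    .Select(s => s.Name);

bool same = viaQuery.SequenceEqual(viaMethods);
Console.WriteLine(string.Join(", ", viaQuery));  // Mia, Zoe
Console.WriteLine(same);                         // True
```

Both queries are deferred, so each is evaluated only when `SequenceEqual` and `string.Join` enumerate it.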

Comparison

Aspect | Query Syntax | Method Syntax
Readability | Better for complex queries with joins | Better for simple chains
Completeness | Supports subset of operators | Supports all operators
Joins/Groups | More intuitive | Requires explicit method calls
Learning curve | Easier for SQL developers | Easier for C# developers

When to Use Each

// Query syntax: Better for complex joins
var query = from order in orders
            join customer in customers on order.CustomerId equals customer.Id
            join product in products on order.ProductId equals product.Id
            where order.Date > DateTime.Now.AddDays(-30)
            orderby order.Date descending
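            // Explicit member names are required: two inferred "Name" members would not compile
            select new { CustomerName = customer.Name, ProductName = product.Name, order.Date };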

// Method syntax: Better for simple operations
var activeUsers = users.Where(u => u.IsActive).ToList();

// Method syntax: Required for some operators
var firstThree = users.Take(3);  // No query syntax equivalent
var distinctNames = users.Select(u => u.Name).Distinct();

Deferred vs Immediate Execution

Understanding execution timing is crucial for LINQ performance.

Deferred Execution

The query is not executed until the result is enumerated:

var numbers = new List<int> { 1, 2, 3, 4, 5 };

// Query is defined but NOT executed
var query = numbers.Where(n => n > 2);

// Modify the source
numbers.Add(6);

// NOW the query executes - includes 6!
foreach (var n in query)
{
    Console.WriteLine(n);  // 3, 4, 5, 6
}

Immediate Execution

These operators force immediate execution:

var numbers = new List<int> { 1, 2, 3, 4, 5 };

// These execute immediately:
var list = numbers.Where(n => n > 2).ToList();      // ToList()
var array = numbers.Where(n => n > 2).ToArray();    // ToArray()
var count = numbers.Count(n => n > 2);              // Count()
var first = numbers.First(n => n > 2);              // First()
var sum = numbers.Sum();                            // Sum()
var dict = numbers.ToDictionary(n => n);            // ToDictionary()
var exists = numbers.Any(n => n > 10);              // Any()
var all = numbers.All(n => n > 0);                  // All()

Multiple Enumeration Problem

// ❌ BAD: Query executes twice
IEnumerable<User> activeUsers = GetUsers().Where(u => u.IsActive);

Console.WriteLine($"Count: {activeUsers.Count()}");  // First execution
foreach (var user in activeUsers)                    // Second execution
{
    ProcessUser(user);
}

// ✅ GOOD: Materialize once
List<User> activeUsers = GetUsers().Where(u => u.IsActive).ToList();

Console.WriteLine($"Count: {activeUsers.Count}");  // Property, not method
foreach (var user in activeUsers)
{
    ProcessUser(user);
}

Streaming vs Non-Streaming

// Streaming operators (process one element at a time)
// - Where, Select, Skip, Take, SelectMany
var streamed = numbers.Where(n => n > 0).Select(n => n * 2);

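// Non-streaming operators (must read the whole source before yielding anything)
// - OrderBy, GroupBy, Reverse
// - Distinct and Union stream their results; Intersect and Except buffer only the second sequence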
var sorted = numbers.OrderBy(n => n);  // Must see all elements to sort

// Buffering operators (store all elements)
// - ToList, ToArray, ToDictionary
var buffered = numbers.ToList();  // Stores entire result
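
The difference can be observed by counting how many elements an operator pulls from its source. The following is a minimal sketch: `Source` and `pulled` are hypothetical helpers written for this demonstration, not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var pulled = new List<int>();   // records every element the source hands out

// Hypothetical instrumented source for this demonstration.
IEnumerable<int> Source()
{
    foreach (var n in new[] { 3, 1, 2 })
    {
        pulled.Add(n);
        yield return n;
    }
}

// Streaming: Select passes elements through one at a time,
// so First() stops after a single element has been pulled.
int firstStreamed = Source().Select(n => n * 10).First();
int pulledByStreaming = pulled.Count;

pulled.Clear();

// Non-streaming: OrderBy has to see every element before it can
// know which one is smallest.
int firstSorted = Source().OrderBy(n => n).First();
int pulledBySorting = pulled.Count;

Console.WriteLine($"{firstStreamed} after {pulledByStreaming} pull(s)");  // 30 after 1 pull(s)
Console.WriteLine($"{firstSorted} after {pulledBySorting} pull(s)");      // 1 after 3 pull(s)
```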

Common LINQ Operators

Operator Categories

Category | Operators
Filtering | Where, OfType
Projection | Select, SelectMany
Sorting | OrderBy, OrderByDescending, ThenBy, ThenByDescending, Reverse
Grouping | GroupBy, ToLookup
Joining | Join, GroupJoin
Set | Distinct, Union, Intersect, Except
Aggregation | Count, Sum, Min, Max, Average, Aggregate
Quantifiers | Any, All, Contains
Partitioning | Take, Skip, TakeWhile, SkipWhile
Element | First, FirstOrDefault, Single, SingleOrDefault, Last, ElementAt
Generation | Range, Repeat, Empty
Conversion | ToList, ToArray, ToDictionary, ToLookup, AsEnumerable, Cast
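
The generation operators in the table are static methods on `Enumerable` rather than extension methods, which makes them easy to overlook. A short sketch of the three listed (variable names are illustrative only):

```csharp
using System;
using System.Linq;

// Range(start, count): 5 consecutive integers starting at 1.
int[] squares = Enumerable.Range(1, 5).Select(n => n * n).ToArray();

// Repeat(element, count): the same value three times.
string rule = string.Concat(Enumerable.Repeat("-", 3));

// Empty<T>(): a typed empty sequence, handy as a "no results" return value.
int emptyCount = Enumerable.Empty<int>().Count();

Console.WriteLine(string.Join(", ", squares));  // 1, 4, 9, 16, 25
Console.WriteLine(rule);                        // ---
Console.WriteLine(emptyCount);                  // 0
```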

Projection Operations

Select - Transform Each Element

var users = new List<User>
{
    new User { Id = 1, FirstName = "John", LastName = "Doe", Age = 30 },
    new User { Id = 2, FirstName = "Jane", LastName = "Smith", Age = 25 }
};

// Simple projection
var names = users.Select(u => u.FirstName);
// Result: ["John", "Jane"]

// Anonymous type projection
var summary = users.Select(u => new
{
    FullName = $"{u.FirstName} {u.LastName}",
    IsAdult = u.Age >= 18
});

// Index-aware projection
var indexed = users.Select((u, index) => new { Index = index, User = u });

// Projection with calculation
var ages = users.Select(u => new
{
    u.FirstName,
    BirthYear = DateTime.Now.Year - u.Age
});

SelectMany - Flatten Collections

var departments = new List<Department>
{
    new Department
    {
        Name = "IT",
        Employees = new List<string> { "Alice", "Bob" }
    },
    new Department
    {
        Name = "HR",
        Employees = new List<string> { "Charlie" }
    }
};

// Flatten nested collections
var allEmployees = departments.SelectMany(d => d.Employees);
// Result: ["Alice", "Bob", "Charlie"]

// With result selector
var employeeDetails = departments.SelectMany(
    d => d.Employees,
    (dept, emp) => new { Department = dept.Name, Employee = emp }
);
// Result: [{ IT, Alice }, { IT, Bob }, { HR, Charlie }]

// Cartesian product
var colors = new[] { "Red", "Blue" };
var sizes = new[] { "S", "M", "L" };

var combinations = colors.SelectMany(
    c => sizes,
    (color, size) => $"{color}-{size}"
);
// Result: ["Red-S", "Red-M", "Red-L", "Blue-S", "Blue-M", "Blue-L"]

Filtering Operations

Where - Filter by Condition

var products = GetProducts();

// Simple filter
var expensive = products.Where(p => p.Price > 100);

// Multiple conditions
var available = products.Where(p => p.Price > 50 && p.InStock);

// Index-aware filter
var evenIndexed = products.Where((p, index) => index % 2 == 0);

// Complex predicate
Func<Product, bool> isOnSale = p =>
    p.DiscountPercent > 0 &&
    p.SaleEndDate > DateTime.Now;

var saleItems = products.Where(isOnSale);

OfType - Filter by Type

var items = new object[] { 1, "hello", 2, "world", 3.14 };

var strings = items.OfType<string>();  // ["hello", "world"]
var integers = items.OfType<int>();    // [1, 2]

// Useful with inheritance
var shapes = new List<Shape> { new Circle(), new Rectangle(), new Circle() };
var circles = shapes.OfType<Circle>();  // Only Circle instances

Sorting Operations

OrderBy / OrderByDescending

var users = GetUsers();

// Ascending order
var byName = users.OrderBy(u => u.Name);

// Descending order
var byAgeDesc = users.OrderByDescending(u => u.Age);

// Multiple sort criteria
var sorted = users
    .OrderBy(u => u.Department)
    .ThenByDescending(u => u.Salary)
    .ThenBy(u => u.Name);

// Custom comparer
var caseInsensitive = users.OrderBy(
    u => u.Name,
    StringComparer.OrdinalIgnoreCase
);

// Reverse existing order
var reversed = users.OrderBy(u => u.Id).Reverse();

Query Syntax Sorting

var sorted = from user in users
             orderby user.Department, user.Salary descending, user.Name
             select user;

Grouping Operations

GroupBy

var orders = GetOrders();

// Simple grouping
var byCustomer = orders.GroupBy(o => o.CustomerId);

// Iterate groups
foreach (var group in byCustomer)
{
    Console.WriteLine($"Customer {group.Key}:");
    foreach (var order in group)
    {
        Console.WriteLine($"  Order {order.Id}: ${order.Total}");
    }
}

// Group with element selector
var orderTotals = orders.GroupBy(
    o => o.CustomerId,
    o => o.Total  // Select only the total
);

// Group with result selector
var customerSummary = orders.GroupBy(
    o => o.CustomerId,
    (customerId, customerOrders) => new
    {
        CustomerId = customerId,
        OrderCount = customerOrders.Count(),
        TotalSpent = customerOrders.Sum(o => o.Total)
    }
);

// Composite key grouping
var byMonthAndYear = orders.GroupBy(o => new
{
    o.OrderDate.Year,
    o.OrderDate.Month
});

// Query syntax grouping
var grouped = from order in orders
              group order by order.CustomerId into customerGroup
              select new
              {
                  CustomerId = customerGroup.Key,
                  Orders = customerGroup.ToList()
              };

ToLookup - Immediate GroupBy

// ToLookup executes immediately (unlike GroupBy)
var lookup = orders.ToLookup(o => o.CustomerId);

// Access groups directly by key
var customer1Orders = lookup[1];  // Returns all orders for customer 1
var customer999Orders = lookup[999];  // Returns empty, not null

// Useful for repeated lookups
foreach (var customerId in customerIds)
{
    var customerOrders = lookup[customerId];  // O(1) lookup
    ProcessOrders(customerOrders);
}
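
A self-contained version of the lookup behaviour described above, as a sketch. The orders are modelled as `(Id, CustomerId)` tuples purely for brevity; the article's `Order` type is not shown, so this shape is an assumption.

```csharp
using System;
using System.Linq;

// Hypothetical orders as (Id, CustomerId) tuples.
var orders = new[]
{
    (Id: 10, CustomerId: 1),
    (Id: 11, CustomerId: 2),
    (Id: 12, CustomerId: 1),
};

// The lookup is built once, right here.
var lookup = orders.ToLookup(o => o.CustomerId);

int forCustomer1 = lookup[1].Count();       // two orders
int forCustomer999 = lookup[999].Count();   // missing key yields an empty sequence
bool hasCustomer2 = lookup.Contains(2);

Console.WriteLine(forCustomer1);     // 2
Console.WriteLine(forCustomer999);   // 0
Console.WriteLine(hasCustomer2);     // True
```

Unlike a `Dictionary`, indexing a lookup with an unknown key does not throw, which removes the need for a `TryGetValue` check in loops like the one above.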

Join Operations

Inner Join

var customers = GetCustomers();
var orders = GetOrders();

// Method syntax
var customerOrders = customers.Join(
    orders,
    customer => customer.Id,
    order => order.CustomerId,
    (customer, order) => new
    {
        CustomerName = customer.Name,
        OrderId = order.Id,
        OrderTotal = order.Total
    }
);

// Query syntax (more readable for joins)
var query = from customer in customers
            join order in orders on customer.Id equals order.CustomerId
            select new
            {
                CustomerName = customer.Name,
                OrderId = order.Id,
                OrderTotal = order.Total
            };

Left Outer Join (GroupJoin)

// Method syntax
var leftJoin = customers.GroupJoin(
    orders,
    customer => customer.Id,
    order => order.CustomerId,
    (customer, customerOrders) => new
    {
        CustomerName = customer.Name,
        Orders = customerOrders.ToList()
    }
);

// Query syntax with DefaultIfEmpty for true left join
var leftOuterJoin = from customer in customers
                    join order in orders on customer.Id equals order.CustomerId into customerOrders
                    from co in customerOrders.DefaultIfEmpty()
                    select new
                    {
                        CustomerName = customer.Name,
                        OrderId = co?.Id,
                        OrderTotal = co?.Total ?? 0
                    };
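
The left outer join pattern can be tried end to end with in-memory data. This is a sketch under assumptions: the customers and orders are anonymous-type objects invented for the example, chosen because they are reference types and therefore come back as `null` from `DefaultIfEmpty()`.

```csharp
using System;
using System.Linq;

// Hypothetical data: Bob has no orders.
var customers = new[]
{
    new { Id = 1, Name = "Ada" },
    new { Id = 2, Name = "Bob" },
};
var orders = new[]
{
    new { Id = 100, CustomerId = 1, Total = 25m },
};

var rows = (from c in customers
            join o in orders on c.Id equals o.CustomerId into matches
            from m in matches.DefaultIfEmpty()   // null when a customer has no orders
            select new
            {
                c.Name,
                OrderId = m?.Id,                 // int? (null for Bob)
                Total = m?.Total ?? 0m
            }).ToList();

foreach (var r in rows)
    Console.WriteLine($"{r.Name}: {(r.OrderId?.ToString() ?? "no orders")}");
// Ada: 100
// Bob: no orders
```

With value-type elements (tuples or structs), `DefaultIfEmpty()` would yield a zeroed value instead of `null`, so the `?.` checks would need a different test.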

Multiple Joins

var result = from order in orders
             join customer in customers on order.CustomerId equals customer.Id
             join product in products on order.ProductId equals product.Id
             join category in categories on product.CategoryId equals category.Id
             select new
             {
                 OrderId = order.Id,
                 CustomerName = customer.Name,
                 ProductName = product.Name,
                 CategoryName = category.Name,
                 Total = order.Total
             };

Cross Join

// All combinations
var crossJoin = from color in colors
                from size in sizes
                select new { color, size };

// Or with SelectMany
var cross = colors.SelectMany(c => sizes, (c, s) => new { Color = c, Size = s });

Aggregation Operations

Basic Aggregations

var numbers = new[] { 1, 2, 3, 4, 5 };

int count = numbers.Count();              // 5
int countEven = numbers.Count(n => n % 2 == 0);  // 2

int sum = numbers.Sum();                  // 15
int sumOfSquares = numbers.Sum(n => n * n);  // 55

int min = numbers.Min();                  // 1
int max = numbers.Max();                  // 5

double average = numbers.Average();       // 3.0

// With selector
var users = GetUsers();
int totalAge = users.Sum(u => u.Age);
int maxAge = users.Max(u => u.Age);
double avgAge = users.Average(u => u.Age);

MinBy / MaxBy (.NET 6+)

var products = GetProducts();

// Get the entire object with min/max property
var cheapest = products.MinBy(p => p.Price);
var mostExpensive = products.MaxBy(p => p.Price);

// Pre-.NET 6 equivalent (sorts the whole sequence; throws if it is empty)
var cheapestOld = products.OrderBy(p => p.Price).First();

Custom Aggregation with Aggregate

var numbers = new[] { 1, 2, 3, 4 };

// Sum using Aggregate
int sum = numbers.Aggregate((acc, n) => acc + n);  // 10

// With seed value
int sumPlusTen = numbers.Aggregate(10, (acc, n) => acc + n);  // 20

// String concatenation
var words = new[] { "Hello", "World", "!" };
string sentence = words.Aggregate((acc, word) => acc + " " + word);
// "Hello World !"

// Complex aggregation: Calculate variance
var values = new[] { 2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0 };
double mean = values.Average();  // 5.0
double variance = values.Aggregate(
    0.0,
    (acc, val) => acc + Math.Pow(val - mean, 2),
    acc => acc / values.Length
);  // 4.0

Set Operations

var list1 = new[] { 1, 2, 3, 4, 5 };
var list2 = new[] { 4, 5, 6, 7, 8 };

// Distinct - Remove duplicates
var unique = new[] { 1, 1, 2, 2, 3 }.Distinct();  // [1, 2, 3]

// DistinctBy (.NET 6+) - Distinct by property
var uniqueUsers = users.DistinctBy(u => u.Email);

// Union - All unique elements from both
var union = list1.Union(list2);  // [1, 2, 3, 4, 5, 6, 7, 8]

// Intersect - Common elements
var common = list1.Intersect(list2);  // [4, 5]

// Except - In first but not in second
var diff = list1.Except(list2);  // [1, 2, 3]

// Set operations with custom comparer
var users1 = GetTeam1Users();
var users2 = GetTeam2Users();
var uniqueByEmail = users1.Union(users2, new UserEmailComparer());

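public class UserEmailComparer : IEqualityComparer<User>
{
    // Ordinal, case-insensitive comparison avoids culture-dependent ToLower() results
    public bool Equals(User x, User y) =>
        string.Equals(x?.Email, y?.Email, StringComparison.OrdinalIgnoreCase);

    // Must agree with Equals: emails that compare equal produce equal hash codes
    public int GetHashCode(User obj) =>
        obj.Email is null ? 0 : StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Email);
}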

Partitioning Operations

var numbers = Enumerable.Range(1, 10);  // [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

// Take - First N elements
var firstThree = numbers.Take(3);  // [1, 2, 3]

// Skip - Skip first N elements
var skipThree = numbers.Skip(3);  // [4, 5, 6, 7, 8, 9, 10]

// TakeLast / SkipLast (.NET Core 2.0+ / .NET Standard 2.1)
var lastThree = numbers.TakeLast(3);  // [8, 9, 10]
var skipLastThree = numbers.SkipLast(3);  // [1, 2, 3, 4, 5, 6, 7]

// TakeWhile - Take while condition is true
var takeWhileLessThan5 = numbers.TakeWhile(n => n < 5);  // [1, 2, 3, 4]

// SkipWhile - Skip while condition is true
var skipWhileLessThan5 = numbers.SkipWhile(n => n < 5);  // [5, 6, 7, 8, 9, 10]

// Pagination
int pageSize = 10;
int pageNumber = 3;
var page = items.Skip((pageNumber - 1) * pageSize).Take(pageSize);

// Chunk (.NET 6+) - Split into batches
var batches = numbers.Chunk(3);  // [[1,2,3], [4,5,6], [7,8,9], [10]]
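
The pagination arithmetic is easy to get off by one, so here is a worked sketch. The `items` source (the numbers 1 to 25) and the `totalPages` calculation are additions for illustration.

```csharp
using System;
using System.Linq;

var items = Enumerable.Range(1, 25);   // hypothetical data: 1..25
int pageSize = 10;
int pageNumber = 3;                    // 1-based page index

int[] page = items
    .Skip((pageNumber - 1) * pageSize) // skip the 20 items on pages 1 and 2
    .Take(pageSize)                    // ask for 10; only 5 remain
    .ToArray();

// Ceiling division gives the total number of pages.
int totalPages = (items.Count() + pageSize - 1) / pageSize;

Console.WriteLine(string.Join(", ", page)); // 21, 22, 23, 24, 25
Console.WriteLine(totalPages);              // 3
```

`Take` does not throw when fewer elements remain than requested, so the last page simply comes back short.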

Performance Considerations

ToList() vs AsEnumerable() vs AsQueryable()

// ToList() - Immediate execution, stores all in memory
List<User> users = dbContext.Users.Where(u => u.IsActive).ToList();

// AsEnumerable() - Switch from IQueryable to IEnumerable
// Subsequent operations execute in memory, not database
var result = dbContext.Users
    .Where(u => u.IsActive)     // Executes in database
    .AsEnumerable()
    .Where(u => CustomLogic(u)); // Executes in memory

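// AsQueryable() - Wrap an in-memory IEnumerable as IQueryable
// Queries still run in memory (LINQ to Objects); useful for testing code that expects IQueryable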
var queryable = localList.AsQueryable();
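
A small sketch of what `AsQueryable()` does to an in-memory list: operators start building an expression tree, but no database is involved and the query still executes through LINQ to Objects. The variable names are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

var localList = new List<int> { 1, 2, 3, 4 };
IQueryable<int> queryable = localList.AsQueryable();

// Where on IQueryable<T> records the call as an expression tree node...
IQueryable<int> evensQuery = queryable.Where(n => n % 2 == 0);
bool isCallNode = evensQuery.Expression.NodeType == ExpressionType.Call;

// ...but with no database provider behind it, enumeration runs in memory.
List<int> evens = evensQuery.ToList();

Console.WriteLine(isCallNode);                // True
Console.WriteLine(string.Join(", ", evens));  // 2, 4
```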

Avoid Multiple Enumerations

// ❌ BAD: Enumerates twice
IEnumerable<User> users = GetExpensiveQuery();
if (users.Any())
{
    foreach (var user in users) { }  // Second enumeration!
}

// ✅ GOOD: Enumerate once
var users = GetExpensiveQuery().ToList();
if (users.Any())
{
    foreach (var user in users) { }
}

// ✅ ALTERNATIVE: Use null pattern
var firstUser = users.FirstOrDefault();
if (firstUser != null)
{
    // Process firstUser and continue enumeration
}

Materialization Points

// Query builds up (deferred)
var query = orders
    .Where(o => o.Date > DateTime.Now.AddDays(-30))
    .OrderBy(o => o.Date);

// Materializes here - executes query
var list = query.ToList();

// Common materialization methods:
// ToList(), ToArray(), ToDictionary(), ToLookup()
// First(), Single(), Last(), ElementAt()
// Count(), Sum(), Min(), Max(), Average(), Aggregate()
// Any(), All(), Contains()

IQueryable vs IEnumerable Performance

// IQueryable<T> - Expression tree, database-side execution
var dbQuery = dbContext.Orders
    .Where(o => o.Total > 100)    // Translated to SQL WHERE
    .OrderBy(o => o.Date)         // Translated to SQL ORDER BY
    .Take(10);                    // Translated to SQL TOP/LIMIT

// IEnumerable<T> - In-memory execution
var memoryQuery = orders.AsEnumerable()
    .Where(o => o.Total > 100)    // Filters in memory
    .OrderBy(o => o.Date)         // Sorts in memory
    .Take(10);                    // Takes in memory

// ⚠️ DANGER: Accidentally switching to client-side
var badQuery = dbContext.Orders
    .ToList()                     // Loads ALL orders into memory!
    .Where(o => o.Total > 100);   // Then filters in memory

PLINQ Basics

Parallel LINQ enables parallel query execution on multi-core systems.

Basic PLINQ

var numbers = Enumerable.Range(1, 1_000_000);

// Sequential
var sequential = numbers
    .Where(n => IsPrime(n))
    .ToList();

// Parallel
var parallel = numbers
    .AsParallel()
    .Where(n => IsPrime(n))
    .ToList();

// Control degree of parallelism
var limited = numbers
    .AsParallel()
    .WithDegreeOfParallelism(4)  // Max 4 threads
    .Where(n => IsPrime(n))
    .ToList();

Preserving Order

// Order not guaranteed by default
var unordered = numbers
    .AsParallel()
    .Select(n => n * 2)
    .ToList();  // May be [4, 2, 8, 6, ...]

// Preserve original order (some performance cost)
var ordered = numbers
    .AsParallel()
    .AsOrdered()
    .Select(n => n * 2)
    .ToList();  // Always [2, 4, 6, 8, ...]

When to Use PLINQ

// ✅ Good candidates for PLINQ:
// - CPU-bound operations
// - Large data sets (thousands of elements)
// - Independent operations (no shared state)

// ❌ Bad candidates:
// - I/O-bound operations (use async instead)
// - Small data sets (parallelization overhead > benefit)
// - Operations with side effects
// - Operations requiring ordering

// Example: Good use case
var results = largeDataSet
    .AsParallel()
    .Select(item => ExpensiveComputation(item))
    .ToList();

// Example: Bad use case (I/O bound)
var files = fileNames
    .AsParallel()  // ❌ I/O bound, threads will block
    .Select(f => File.ReadAllText(f))
    .ToList();

// Better approach for I/O
var filesAsync = await Task.WhenAll(
    fileNames.Select(f => File.ReadAllTextAsync(f))
);

ForAll - Parallel Side Effects

// ForAll runs the action on the worker threads without merging results first
// (the call itself still blocks until every element has been processed)
numbers
    .AsParallel()
    .Where(n => n % 2 == 0)
    .ForAll(n => Console.WriteLine(n));  // Order not guaranteed

// vs sequential ForEach
numbers
    .AsParallel()
    .Where(n => n % 2 == 0)
    .ToList()
    .ForEach(n => Console.WriteLine(n));  // Sequential after ToList
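
Because `ForAll` invokes its action from several threads at once, anything the action writes to has to be thread-safe. The sketch below collects results into a `ConcurrentBag<int>`; the choice of collection is an assumption made for this example, and a plain `List<int>` would not be safe here.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

var numbers = Enumerable.Range(1, 100);
var evens = new ConcurrentBag<int>();   // safe for concurrent Add calls

numbers
    .AsParallel()
    .Where(n => n % 2 == 0)
    .ForAll(n => evens.Add(n));          // returns only after every item is processed

Console.WriteLine(evens.Count);          // 50
Console.WriteLine(evens.Sum());          // 2550
```

The order of elements inside `evens` is unspecified, which is why the example checks only the count and the sum.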

Common Pitfalls

❌ Capturing Loop Variables

// ❌ BAD: All lambdas capture the same variable
var queries = new List<Func<int>>();
for (int i = 0; i < 5; i++)
{
    queries.Add(() => i);  // All return 5!
}

// ✅ FIX: Capture in local variable
for (int i = 0; i < 5; i++)
{
    int local = i;
    queries.Add(() => local);  // Returns 0, 1, 2, 3, 4
}

❌ Side Effects in LINQ

// ❌ BAD: Side effects in Where
int count = 0;
var result = numbers.Where(n =>
{
    count++;  // Side effect!
    return n > 5;
});
// count is still 0 here: the predicate runs only when result is enumerated,
// and it runs again on every subsequent enumeration

// ✅ GOOD: Separate concerns
var filtered = numbers.Where(n => n > 5).ToList();
int count = filtered.Count;
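
To make the problem concrete, the following sketch tracks how often the predicate runs. The counter `calls` and the intermediate snapshot variables are introduced only for this demonstration.

```csharp
using System;
using System.Linq;

var numbers = Enumerable.Range(1, 10);
int calls = 0;

var query = numbers.Where(n =>
{
    calls++;            // side effect, for demonstration only
    return n > 5;
});

int afterDefinition = calls;      // nothing has run yet
var firstPass = query.ToList();   // first enumeration
int afterFirstPass = calls;       // one call per source element
int matches = query.Count();      // second enumeration
int afterSecondPass = calls;      // the predicate ran all over again

Console.WriteLine($"{afterDefinition}, {afterFirstPass}, {afterSecondPass}");  // 0, 10, 20
Console.WriteLine(matches);                                                    // 5
```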

❌ Null Reference in LINQ

// ❌ Potential NullReferenceException
var names = users.Select(u => u.Address.City);

// ✅ Handle nulls
var names = users
    .Where(u => u.Address != null)
    .Select(u => u.Address.City);

// Or with null-conditional
var names = users.Select(u => u.Address?.City);

Best Practices

✅ Use Method Chains Wisely

// Good: Clear and readable
var result = orders
    .Where(o => o.IsActive)
    .OrderByDescending(o => o.Date)
    .Take(10)
    .Select(o => new OrderDto(o))
    .ToList();

// Avoid: Too many transformations, hard to debug
var complex = data
    .Where(x => x.A)
    .Select(x => x.B)
    .SelectMany(x => x.C)
    .GroupBy(x => x.D)
    .Select(g => g.E)
    // ... many more
    .ToList();

✅ Materialize When Needed

// Materialize before multiple operations
var users = GetUsers().ToList();
var activeCount = users.Count(u => u.IsActive);
var inactiveCount = users.Count(u => !u.IsActive);

// Or use GroupBy for single enumeration
var counts = users.GroupBy(u => u.IsActive)
    .ToDictionary(g => g.Key, g => g.Count());

✅ Use Appropriate Operators

// Use Any() instead of Count() > 0
if (users.Any(u => u.IsAdmin)) { }  // ✅ Stops at first match
if (users.Count(u => u.IsAdmin) > 0) { }  // ❌ Counts all

// Use the predicate overload instead of Where() followed by an element operator
var user = users.FirstOrDefault(u => u.Id == id);           // ✅ Concise
var user = users.Where(u => u.Id == id).FirstOrDefault();   // ❌ Same result, more verbose
// Note: First() throws when nothing matches; FirstOrDefault() returns default(T)

Interview Questions

1. What is deferred execution in LINQ?

Answer: Deferred execution means the query is not executed when it's defined, but when it's enumerated (via foreach, ToList, etc.). This allows:

  • Building queries incrementally
  • Getting updated results if source data changes
  • Avoiding unnecessary computation

Operators like Where, Select, OrderBy use deferred execution. Operators like ToList, Count, First force immediate execution.
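
The first two benefits can be sketched in a few lines. The filter flags `onlyEven` and `minimum` are hypothetical inputs standing in for something like user-selected search options.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };
bool onlyEven = true;   // hypothetical filter options
int? minimum = 3;

// Start from the whole source and add filters as needed; nothing runs yet.
IEnumerable<int> query = numbers;
if (onlyEven) query = query.Where(n => n % 2 == 0);
if (minimum.HasValue) query = query.Where(n => n >= minimum.Value);

numbers.Add(8);                       // added after the query was defined

List<int> result = query.ToList();    // the single point where the work happens
Console.WriteLine(string.Join(", ", result));  // 4, 6, 8
```

The value 8 appears in the output even though it was added after the query was composed, because the source is read only when `ToList()` enumerates it.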


2. What's the difference between IEnumerable<T> and IQueryable<T>?

Answer:

  • IEnumerable<T>: Executes in memory using delegates. LINQ operations compile to method calls. Best for in-memory collections.
  • IQueryable<T>: Uses expression trees that can be translated to other query languages (SQL, etc.). LINQ operations build an expression tree that's translated by the provider.

// IEnumerable - filter happens in memory
var memory = users.AsEnumerable().Where(u => u.Age > 18);

// IQueryable - filter translated to SQL WHERE clause
var query = dbContext.Users.Where(u => u.Age > 18);

3. What is the difference between Select and SelectMany?

Answer:

  • Select: Maps each element to exactly one result (1:1 mapping)
  • SelectMany: Maps each element to multiple results and flattens them (1:N mapping)

var departments = GetDepartments();

// Select: Returns IEnumerable<List<Employee>>
var nested = departments.Select(d => d.Employees);

// SelectMany: Returns IEnumerable<Employee> (flattened)
var flat = departments.SelectMany(d => d.Employees);

4. How do you perform a left outer join in LINQ?

Answer: Use GroupJoin with DefaultIfEmpty():

var leftJoin = from customer in customers
               join order in orders on customer.Id equals order.CustomerId into customerOrders
               from co in customerOrders.DefaultIfEmpty()
               select new
               {
                   CustomerName = customer.Name,
                   OrderId = co?.Id
               };

5. When should you use PLINQ?

Answer: Use PLINQ when:

  • You have CPU-bound, parallelizable work
  • Large datasets (thousands+ elements)
  • Operations are independent (no shared state)
  • Each operation takes significant time

Avoid PLINQ for:

  • I/O-bound operations (use async instead)
  • Small datasets (overhead > benefit)
  • Operations requiring specific ordering
  • Operations with side effects

6. What's the difference between First() and Single()?

Answer:

  • First(): Returns the first matching element. Throws InvalidOperationException if there is no match.
  • FirstOrDefault(): Returns the first matching element, or default(T) if there is none.
  • Single(): Returns the only matching element. Throws InvalidOperationException if there is no match OR more than one.
  • SingleOrDefault(): Returns the only matching element, or default(T) if there is none. Throws if there is more than one.

Use Single when you expect exactly one match (validates business logic). Use First when you want any match.
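
The four behaviours can be compared side by side. This is a minimal sketch over a made-up `ids` array that deliberately contains a duplicate.

```csharp
using System;
using System.Linq;

var ids = new[] { 1, 2, 2, 3 };   // hypothetical data; note the duplicate 2

int first = ids.First(n => n == 2);               // first of two matches
int missing = ids.FirstOrDefault(n => n == 99);   // no match: default(int)
int onlyThree = ids.Single(n => n == 3);          // exactly one match

bool singleThrew = false;
try
{
    ids.Single(n => n == 2);                      // two matches
}
catch (InvalidOperationException)
{
    singleThrew = true;
}

Console.WriteLine(first);        // 2
Console.WriteLine(missing);      // 0
Console.WriteLine(onlyThree);    // 3
Console.WriteLine(singleThrew);  // True
```

For value types such as `int`, the default returned by `FirstOrDefault` is indistinguishable from a genuine 0, so a nullable projection or `Any` check may be needed when 0 is a valid element.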

