SeoSanitizer::cleanSEOTitle() - Code Metrics - Inspection of "style, cosmetic" - Dispositif/Wikibot

Passed

Push — master ( dff8a4...2556d0 )

by Dispositif

created 2023-04-08 11:33 UTC

SeoSanitizer::cleanSEOTitle() B

↳ Parent: SeoSanitizer

Complexity

Conditions	8
Paths	5

Size

Total Lines	37
Code Lines	18

Duplication

Lines	0
Ratio	0 %

Importance

Changes	1
Bugs	0
Features	0

Metric	Value
cc	8
eloc	18
c	1
b	0
f	0
nc	5
nop	2
dl	0
loc	37
rs	8.4444

<?php
/*
 * This file is part of dispositif/wikibot application (@github)
 * 2019-2023 © Philippe M./Irønie  <[email protected]>
 * For the full copyright and MIT license information, view the license file.
 */

declare(strict_types=1);

namespace App\Domain\Publisher;

use App\Domain\Utils\TextUtil;

class SeoSanitizer
{
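    /**
     * Naive SEO sanitization of a web page title.
     *
     * @param string      $prettyDomainName pretty domain name such as "google.com" or "google.co.uk"
     * @param string|null $title            raw page title; null or blank yields null
     */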
    public function cleanSEOTitle(string $prettyDomainName, ?string $title): ?string
    {
        // null-safe guard: under strict_types, trim(null) would throw a TypeError
        if (null === $title || empty(trim($title))) {
            return null;
        }
        // normalize en/em dashes to hyphens and backslashes to slashes
        $title = str_replace(['–', '—', '\\'], ['-', '-', '/'], $title);

        $seoSegments = $this->extractSEOSegments($title);
        // No SEO segmentation found
        if (count($seoSegments) < 2) {
            return $title;
        }

        // remove seo segments containing same words as the website domain name
        $seoSegmentsFiltered = array_values(array_filter($seoSegments, function ($segment) use ($prettyDomainName) {
            // strip string after last dot in prettyDomainName : blabla.com => blabla
            $domainMain = preg_replace('/\.[^.]*$/', '', $prettyDomainName);
            // strip string if only 2 chars after dot : like in 'blabla.co.uk'
            $domainMain = preg_replace('/\.[^.]{2}$/', '', $domainMain);

            $strippedSegment = str_replace([' ', '-'], '', mb_strtolower(TextUtil::stripPunctuation(TextUtil::stripAccents($segment))));

            return !empty(trim($segment))
                && false === strpos($strippedSegment, str_replace(['.', '-'], '', $prettyDomainName))
                && false === strpos($strippedSegment, str_replace(['.', '-'], '', $domainMain));
        }));
        if (count($seoSegmentsFiltered) === 0) {
            return trim($seoSegments[0]); // before filtering
        }

        // if only one segment or first segment is long enough, return it
        if (count($seoSegmentsFiltered) === 1 || mb_strlen($seoSegmentsFiltered[0]) >= 25) {
            return trim($seoSegmentsFiltered[0]);
        }

        // rebuild bestTitle but keep only the first 2 SEO segments
        return trim($seoSegmentsFiltered[0]) . ' - ' . trim($seoSegmentsFiltered[1]);
    }

    private function extractSEOSegments(string $title): array
    {
        $seoSeparator = $this->getSEOSeparator($title);
        if (null === $seoSeparator) {
            return [$title];
        }

        return explode($seoSeparator, $title);
    }

    private function getSEOSeparator(string $title): ?string
    {
        $title = str_replace(' — ', ' - ', $title);
        if (strpos($title, ' | ') !== false) {
            return ' | ';
        }
        if (strpos($title, ' - ') !== false) {
            return ' - ';
        }
        if (strpos($title, ' / ') !== false) {
            return ' / ';
        }
        if (strpos($title, ' : ') !== false) {
            return ' : ';
        }

        return null;
    }
}
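
The separator precedence and the domain-name stripping used above can be tried in isolation. The following is a minimal sketch, not part of the inspected class: `detectSeparator()` and `domainMain()` are hypothetical standalone helpers that mirror `getSEOSeparator()` and the two `preg_replace()` calls in the filter closure, so they run without `TextUtil`.

```php
<?php

declare(strict_types=1);

// Hypothetical helpers for illustration only; they are not part of SeoSanitizer.

/**
 * Returns the first separator found, checked in the same order as
 * getSEOSeparator(): ' | ', then ' - ', then ' / ', then ' : '.
 */
function detectSeparator(string $title): ?string
{
    foreach ([' | ', ' - ', ' / ', ' : '] as $separator) {
        if (strpos($title, $separator) !== false) {
            return $separator;
        }
    }

    return null;
}

/**
 * Reduces a pretty domain name to its main label, as the filter closure does:
 * drop everything after the last dot, then drop a trailing 2-char label.
 */
function domainMain(string $prettyDomainName): string
{
    // "google.co.uk" => "google.co", "google.com" => "google"
    $main = (string) preg_replace('/\.[^.]*$/', '', $prettyDomainName);

    // "google.co" => "google"; "google" is left unchanged
    return (string) preg_replace('/\.[^.]{2}$/', '', $main);
}

// The pipe wins over the hyphen because it is checked first.
$title = 'Some article - Section | Example News';
$separator = detectSeparator($title);
$segments = explode((string) $separator, $title);
```

With this input, `$segments` holds two entries, `'Some article - Section'` and `'Example News'`, because only the highest-priority separator is used for splitting. A title containing none of the four separators yields `null`, which is the case where `extractSEOSegments()` returns the title as a single segment.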
