2022-04-26

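Handling a failed MongoDB rollback: replSet too much data to roll back

Environment

Operating system: CentOS Linux release 8.2.2004 (Core)
MongoDB version: 3.6.21

Problem description

A production MongoDB node crashed. It was restarted automatically, but shortly afterwards it could no longer start.
The logs showed that the rollback had failed: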

$ grep 'rsBackgroundSync' replication.log
2022-04-25T15:56:22.777+0800 I REPL     [rsBackgroundSync] Starting rollback due to OplogStartMissing: Our last op time fetched: { ts: Timestamp(1650870067, 9), t: 149 }. source's GTE: { ts: Timestamp(1650870144, 1), t: 150 } hashes: (-4603273711463716908/-773514121576334543)
2022-04-25T15:56:22.777+0800 I REPL     [rsBackgroundSync] Replication commit point: { ts: Timestamp(0, 0), t: -1 }
2022-04-25T15:56:22.777+0800 I REPL     [rsBackgroundSync] Rollback using the 'rollbackViaRefetch' method because UUID support is feature compatible with featureCompatibilityVersion 3.6.
2022-04-25T15:56:22.777+0800 I REPL     [rsBackgroundSync] transition to ROLLBACK from SECONDARY
2022-04-25T15:56:22.777+0800 I NETWORK  [rsBackgroundSync] Skip closing connection for connection # 1
2022-04-25T15:56:22.777+0800 I ROLLBACK [rsBackgroundSync] Starting rollback. Sync source: 172.16.31.47:27018
2022-04-25T15:56:22.779+0800 I ROLLBACK [rsBackgroundSync] Finding the Common Point
2022-04-25T15:56:22.782+0800 I ROLLBACK [rsBackgroundSync] our last optime:   Timestamp(1650870067, 9)
2022-04-25T15:56:22.782+0800 I ROLLBACK [rsBackgroundSync] their last optime: Timestamp(1650873382, 167)
2022-04-25T15:56:22.782+0800 I ROLLBACK [rsBackgroundSync] diff in end of log times: -3315 seconds
2022-04-25T15:56:45.012+0800 I ROLLBACK [rsBackgroundSync] Rollback common point is { ts: Timestamp(1650869833, 2586), t: 149 }
2022-04-25T15:56:45.012+0800 I REPL     [rsBackgroundSync] Incremented the rollback ID to 22
2022-04-25T15:56:45.012+0800 I ROLLBACK [rsBackgroundSync] Starting refetching documents
2022-04-25T15:57:58.580+0800 I ROLLBACK [rsBackgroundSync] Rollback finished. The final minValid is: { ts: Timestamp(1650776551, 102), t: 148 }
2022-04-25T15:57:58.580+0800 F ROLLBACK [rsBackgroundSync] Unable to complete rollback. A full resync may be needed: UnrecoverableRollbackError: replSet too much data to roll back.
2022-04-25T15:57:58.580+0800 F -        [rsBackgroundSync] Fatal Assertion 40507 at src/mongo/db/repl/rs_rollback.cpp 1516
2022-04-25T15:57:58.580+0800 F -        [rsBackgroundSync] \n\n***aborting after fassert() failure\n\n

The error: Unable to complete rollback. A full resync may be needed: UnrecoverableRollbackError: replSet too much data to roll back
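The "diff in end of log times" value can be reproduced from the optimes printed in the log. A minimal Python sketch, assuming the first field of Timestamp(seconds, increment) is Unix seconds; the regular expression and the function name are illustrative and not part of MongoDB:

```python
import re

# Matches the "Timestamp(<seconds>, <increment>)" form seen in the log lines above.
TS_RE = re.compile(r"Timestamp\((\d+),\s*(\d+)\)")


def optime_seconds(log_line: str) -> int:
    """Return the seconds field of the first Timestamp(...) found in a log line."""
    match = TS_RE.search(log_line)
    if match is None:
        raise ValueError("no Timestamp(...) found")
    return int(match.group(1))


ours = optime_seconds("our last optime:   Timestamp(1650870067, 9)")
theirs = optime_seconds("their last optime: Timestamp(1650873382, 167)")
common = optime_seconds("Rollback common point is { ts: Timestamp(1650869833, 2586), t: 149 }")

print(ours - theirs)  # -3315, the value logged as "diff in end of log times"
print(ours - common)  # 234 seconds of local oplog after the common point
```

Roughly four minutes of local oplog lie past the common point, while the sync source is about 55 minutes ahead.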

I searched online and found no fix for this.

Why a rollback fails is covered in other people's articles, so I won't repeat it here:

https://www.cnblogs.com/andy6/p/9837388.html

Fixing the problem

https://jira.mongodb.org/browse/SERVER-47918

Under condition #1 the 300MB rollback limit is no longer enforced post-4.0

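So the limit is no longer enforced after 4.0...

Could it be a hard-coded limit, then? And if it is, could it be lifted by changing the code?

Let's pull the code and take a look.

$ git checkout r3.6.21

Open src/mongo/db/repl/rs_rollback.cpp in vim and go to line 1028: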

            // Checks that the total amount of data that needs to be refetched is at most
            // 300 MB. We do not roll back more than 300 MB of documents in order to
            // prevent out of memory errors from too much data being stored. See SERVER-23392.
            if (totalSize >= 300 * 1024 * 1024) {
                throw RSFatalException("replSet too much data to roll back.");
            }

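It really is just a hard-coded limit, so I commented out this if block and rebuilt mongod.

I won't go through the build process here. With the rebuilt binary the node completed the rollback and the problem was solved.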

2021-11-24

Openshift

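How the OpenShift webconsole proxy works

Preface

Production had an incident today: accessing the console returned 503 straight away.

Tracing the console address, I found that the service behind the console URL is the apiserver, which surprised me... Here is the matching error in the apiserver log: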

I1229 19:15:18.181339   20697 logs.go:49] http: proxy error: x509: certificate has expired or is not yet valid: current time 2021-12-29T19:15:18+08:00 is after 2021-10-30T13:47:51Z

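This... gives nothing to go on!
The only option was to read the apiserver source code.
(A word of complaint about how heavily OpenShift modifies upstream!)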
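The log line does carry one useful fact: how long ago the certificate expired. A small Python sketch that compares the two timestamps quoted in the message; the helper name is made up for illustration:

```python
from datetime import datetime


def parse_rfc3339(value: str) -> datetime:
    """Parse an RFC 3339 timestamp; a trailing 'Z' is rewritten as +00:00."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


# Both values are copied from the proxy error in the apiserver log.
current = parse_rfc3339("2021-12-29T19:15:18+08:00")
not_after = parse_rfc3339("2021-10-30T13:47:51Z")

overdue = current - not_after
print(overdue.days)  # 59
```

The certificate had been expired for about two months by the time the console started returning 503.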

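Prerequisites

Have skimmed the apiserver code
Know go-restful

handler chain

The API server's handlers resemble the filter mechanism of Java frameworks (or Django's middleware), with some differences; explaining more would only make it harder to follow. Suppose there are three handlers a, b and c, and the HTTP server is configured with a. a has its own logic and, when some condition is met, calls b; b in turn may call c under some condition. That is a handler chain. If you are interested, read the logic in BuildHandlerChain.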
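The a, b, c description can be sketched in a few lines of Python. This only illustrates the wrapping pattern and is not the apiserver's code; the handler names and the conditions (/healthz, /internal) are invented for the example:

```python
from typing import Callable, List

# A handler takes a request path plus a trace list and returns a response string.
Handler = Callable[[str, List[str]], str]


def handler_c(path: str, trace: List[str]) -> str:
    """Innermost handler: always answers the request itself."""
    trace.append("c")
    return f"c handled {path}"


def make_b(next_handler: Handler) -> Handler:
    """b answers some requests itself and passes the rest on to the next handler."""
    def handler_b(path: str, trace: List[str]) -> str:
        trace.append("b")
        if path.startswith("/internal"):
            return "b rejected request"
        return next_handler(path, trace)
    return handler_b


def make_a(next_handler: Handler) -> Handler:
    """a is the handler the HTTP server is configured with."""
    def handler_a(path: str, trace: List[str]) -> str:
        trace.append("a")
        if path == "/healthz":
            return "ok"
        return next_handler(path, trace)
    return handler_a


# Built inside out: the last wrapper applied is the first one to see a request.
chain = make_a(make_b(handler_c))
```

Because each wrapper is applied around the previous one, the handler added last runs first.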

Conventions

OpenShift version: 3.11

Kubernetes APIServer startup flow

I'll skip the earlier parsing and jump straight to the startup code at pkg/cmd/server/origin/master.go:179

func (c *MasterConfig) Run(stopCh <-chan struct{}) error {
    var err error
    var apiExtensionsInformers apiextensionsinformers.SharedInformerFactory
    var delegateAPIServer apiserver.DelegationTarget
    var extraPostStartHooks map[string]apiserver.PostStartHookFunc

    c.kubeAPIServerConfig.GenericConfig.BuildHandlerChainFunc, extraPostStartHooks, err = openshiftkubeapiserver.BuildHandlerChain(
        c.kubeAPIServerConfig.GenericConfig, c.ClientGoKubeInformers,
        c.Options.ControllerConfig.ServiceServingCert.Signer.CertFile, c.Options.OAuthConfig, c.Options.PolicyConfig.UserAgentMatchingConfig)
    if err != nil {
        return err
    }
    // ....
}

Next, look at the BuildHandlerChain function at pkg/cmd/openshift-kube-apiserver/openshiftkubeapiserver/patch_handlerchain.go:28

func BuildHandlerChain(genericConfig *genericapiserver.Config, kubeInformers informers.SharedInformerFactory, legacyServiceServingCertSignerCABundle string, oauthConfig *configapi.OAuthConfig, userAgentMatchingConfig configapi.UserAgentMatchingConfig) (func(apiHandler http.Handler, kc *genericapiserver.Config) http.Handler, map[string]genericapiserver.PostStartHookFunc, error) {
    extraPostStartHooks := map[string]genericapiserver.PostStartHookFunc{}

    webconsoleProxyHandler, err := newWebConsoleProxy(kubeInformers, legacyServiceServingCertSignerCABundle)
    if err != nil {
        return nil, nil, err
    }
    oauthServerHandler, newPostStartHooks, err := NewOAuthServerHandler(genericConfig, oauthConfig)
    if err != nil {
        return nil, nil, err
    }
    for name, fn := range newPostStartHooks {
        extraPostStartHooks[name] = fn
    }

    return func(apiHandler http.Handler, genericConfig *genericapiserver.Config) http.Handler {
            // Machinery that let's use discover the Web Console Public URL
            accessor := newWebConsolePublicURLAccessor(genericConfig.LoopbackClientConfig)
            // the webconsole is proxied through the API server.  This starts a small controller that keeps track of where to proxy.
            // TODO stop proxying the webconsole. Should happen in a future release.
            extraPostStartHooks["openshift.io-webconsolepublicurl"] = func(context genericapiserver.PostStartHookContext) error {
                go accessor.Run(context.StopCh)
                return nil
            }

            // these are after the kube handler
            handler := versionSkewFilter(apiHandler, userAgentMatchingConfig)

            // this is the normal kube handler chain
            handler = genericapiserver.DefaultBuildHandlerChain(handler, genericConfig)

            // these handlers are all before the normal kube chain
            handler = translateLegacyScopeImpersonation(handler)
            handler = configprocessing.WithCacheControl(handler, "no-store") // protected endpoints should not be cached

            // redirects from / to /console if you're using a browser
            handler = withAssetServerRedirect(handler, accessor)

            // these handlers are actually separate API servers which have their own handler chains.
            // our server embeds these
            handler = withConsoleRedirection(handler, webconsoleProxyHandler, accessor)
            handler = withOAuthRedirection(oauthConfig, handler, oauthServerHandler)

            return handler
        },
        extraPostStartHooks,
        nil
}

The logic of newWebConsoleProxy:

func newWebConsoleProxy(kubeInformers informers.SharedInformerFactory, legacyServiceServingCertSignerCABundle string) (http.Handler, error) {
    caBundle, err := ioutil.ReadFile(legacyServiceServingCertSignerCABundle)
    if err != nil {
        return nil, err
    }
    proxyHandler, err := newServiceProxyHandler("webconsole", "openshift-web-console", aggregatorapiserver.NewClusterIPServiceResolver(kubeInformers.Core().V1().Services().Lister()), caBundle, "OpenShift web console")
    if err != nil {
        return nil, err
    }
    return proxyHandler, nil
}

The logic of newServiceProxyHandler:

// newServiceProxyHandler is a simple proxy that doesn't handle upgrades, passes headers directly through, and doesn't assert any identity.
func newServiceProxyHandler(serviceName string, serviceNamespace string, serviceResolver ServiceResolver, caBundle []byte, applicationDisplayName string) (*serviceProxyHandler, error) {
    restConfig := &restclient.Config{
        TLSClientConfig: restclient.TLSClientConfig{
            ServerName: serviceName + "." + serviceNamespace + ".svc",
            CAData:     caBundle,
        },
    }
    proxyRoundTripper, err := restclient.TransportFor(restConfig)
    if err != nil {
        return nil, err
    }

    return &serviceProxyHandler{
        serviceName:            serviceName,
        serviceNamespace:       serviceNamespace,
        serviceResolver:        serviceResolver,
        applicationDisplayName: applicationDisplayName,
        proxyRoundTripper:      proxyRoundTripper,
        restConfig:             restConfig,
    }, nil
}

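Description of the serviceProxyHandler struct:

proxyHandler provides a http.Handler which will proxy traffic to locations specified by items implementing Redirector.

serviceProxyHandler implements ServeHTTP, so when a request arrives it is passed straight through to the backend.
The accessor periodically reads the console URL from the configmap openshift-web-console/webconsole-config. withConsoleRedirection uses this URL to decide whether a request is for the console: if it is, the traffic is handed to the console; otherwise the next handler in the chain processes the request.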
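The routing decision described above can be approximated as a path-prefix check. This Python sketch is an assumption about the behaviour, not a port of withConsoleRedirection; the function name and the example URL are hypothetical:

```python
from urllib.parse import urlparse


def is_console_request(request_path: str, console_public_url: str) -> bool:
    """Return True when request_path falls under the console's public URL path."""
    if not console_public_url:
        # The accessor has not read the configmap yet, so nothing is proxied.
        return False
    prefix = urlparse(console_public_url).path.rstrip("/")
    if not prefix:
        # A console served at "/" would match everything; this sketch refuses that.
        return False
    return request_path == prefix or request_path.startswith(prefix + "/")


def route(request_path: str, console_public_url: str) -> str:
    """Name the handler that would serve the request in this simplified model."""
    if is_console_request(request_path, console_public_url):
        return "webconsole proxy"
    return "next handler"
```

With a public URL such as https://master.example.com:8443/console/, a request for /console/ goes to the proxy while /api/v1 falls through to the rest of the chain.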

2021-10-08

Renewing the OpenShift Catalog certificate

Troubleshooting

controller-manager stays in the BackOff state:

$ oc -n kube-service-catalog get pods
...
controller-manager-x5dr8   0/1       CrashLoopBackOff   21         1h

Check the logs:

$ oc -n kube-service-catalog logs -f --tail=30 controller-manager-x5dr8
I1008 06:24:21.655748       1 feature_gate.go:194] feature gates: map[OriginatingIdentity:true]
I1008 06:24:21.655957       1 feature_gate.go:194] feature gates: map[OriginatingIdentity:true AsyncBindingOperations:true]
I1008 06:24:21.655983       1 feature_gate.go:194] feature gates: map[OriginatingIdentity:true AsyncBindingOperations:true NamespacedServiceBroker:true]
I1008 06:24:21.656012       1 hyperkube.go:192] Service Catalog version v3.11.0-0.1.35+8d4f895-2;Upstream:v0.1.35 (built 2019-01-08T23:12:26Z)
I1008 06:24:21.659263       1 leaderelection.go:185] attempting to acquire leader lease  kube-service-catalog/service-catalog-controller-manager...
I1008 06:24:21.677905       1 leaderelection.go:194] successfully acquired lease kube-service-catalog/service-catalog-controller-manager
I1008 06:24:21.678992       1 event.go:221] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-service-catalog", Name:"service-catalog-controller-manager", UID:"f4993f8c-93f0-11e9-9c59-00163e0a2de7", APIVersion:"v1", ResourceVersion:"189814138", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' controller-manager-x5dr8-external-service-catalog-controller became leader
F1008 06:24:21.726721       1 controller_manager.go:237] error running controllers: failed to get api versions from server: failed to get supported resources from server: unable to retrieve the complete list of server APIs: servicecatalog.k8s.io/v1beta1: the server is currently unable to handle the request

The error looked a bit odd. I read the corresponding code and found that it is simply what the APIServer returns, so I checked directly with cURL:

$ TOKEN=$(oc whoami -t)
$ curl  -X GET -H "Authorization: Bearer ${TOKEN}" 'https://1.2.3.4:8443/apis/servicecatalog.k8s.io/v1beta1'
Error: 'x509: certificate has expired or is not yet valid'

So the catalog service's certificate had expired.

Solution

Use OpenShift-Ansible to redeploy the certificates.

Make a copy of playbooks/redeploy-certificates.yml and, in the copy, comment out the playbooks for the other components, keeping only init and catalog.

$ cd openshift-ansible
$ cp -a  playbooks/redeploy-certificates.yml playbooks/redeploy-certificates-catalog.yml

$ cat playbooks/redeploy-certificates-catalog.yml
---
- import_playbook: init/main.yml

- import_playbook: openshift-service-catalog/private/redeploy-certificates.yml
  when: openshift_enable_service_catalog | default(true) | bool

Redeploy:

$ ansible-playbook -i </path/to/inventory/file>  playbooks/redeploy-certificates-catalog.yml

Confirm after the deployment:

$ curl  -I -X GET -H "Authorization: Bearer ${TOKEN}" 'https://1.2.3.4:8443/apis/servicecatalog.k8s.io/v1beta1'
HTTP/1.1 200 OK
Cache-Control: no-store
Content-Type: application/json
Date: Fri, 08 Oct 2021 06:37:39 GMT
Transfer-Encoding: chunked

2021-09-24

Openshift

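Notes on OpenShift CNI networking with OpenvSwitch

Overall network flow

Background needed for this article

OpenShift

See the official documentation: https://docs.openshift.com/

OpenVSwitch

Again, the official documentation is the best place to start: http://www.openvswitch.org/

If your English is not great, translation software can help...
Here are a few good Chinese-language articles written by experienced authors:

ovs manual:

Conventions

OpenShift version: 3.11
OpenFlow version: OpenFlow13

Cluster network configuration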

$ oc get clusternetwork
NAME      CLUSTER NETWORKS   SERVICE NETWORK   PLUGIN NAME
default   10.128.0.0/14:9    172.30.0.0/16     redhat/openshift-ovs-subnet

$ oc get clusternetwork default -o yaml
apiVersion: network.openshift.io/v1
clusterNetworks:
- CIDR: 10.128.0.0/14
  hostSubnetLength: 9
hostsubnetlength: 9
kind: ClusterNetwork
metadata:
  name: default
network: 10.128.0.0/14
pluginName: redhat/openshift-ovs-subnet
serviceNetwork: 172.30.0.0/16
vxlanPort: 4789

Host subnet configuration

$ oc get hostsubnets
NAME                 HOST                 HOST IP       SUBNET          EGRESS CIDRS   EGRESS IPS
oc-01.test   oc-01.test   172.16.1.15   10.128.0.0/23   []             []
oc-02.test   oc-02.test   172.16.1.16   10.129.0.0/23   []             []
oc-03.test   oc-03.test   172.16.1.22   10.130.0.0/23   []             []
oc-04.test   oc-04.test   172.16.1.25   10.131.0.0/23   []             []

Cluster network policy

$ oc get networkpolicy --all-namespaces
No resources found.

PS: an overall network traffic diagram is missing here, to be added...

How OpenVSwitch implements the networking

View the network interfaces

$ ovs-vsctl show
6acf82ec-xxxx-xxxx-xxxx-xxxxxxxxxxx
    Bridge "br0"
        fail_mode: secure
....
        Port "vxlan0"
            Interface "vxlan0"
                type: vxlan
                options: {dst_port="4789", key=flow, remote_ip=flow}
        Port "br0"
            Interface "br0"
                type: internal
        Port "tun0"
            Interface "tun0"
                type: internal
        Port "vethxxxxx"
            Interface "vethxxxxx"
    ovs_version: "2.7.0"
....

$ ovs-ofctl -O OpenFlow13 show br0
OFPT_FEATURES_REPLY (OF1.3) (xid=0x2): dpid:00003edba3c69f45
n_tables:254, n_buffers:0
capabilities: FLOW_STATS TABLE_STATS PORT_STATS GROUP_STATS QUEUE_STATS
OFPST_PORT_DESC reply (OF1.3) (xid=0x3):
 1(vxlan0): addr:96:1c:47:a4:9f:70
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 2(tun0): addr:f6:f8:76:bb:7e:e9
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
....
OFPT_GET_CONFIG_REPLY (OF1.3) (xid=0x5): frags=nx-match miss_send_len=0

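Note: to keep the output readable, the Pods' veth interfaces are omitted above.

vxlan0: creates a tunnel using the VXLAN protocol, which carries traffic between pods on different hosts. The remote_ip of vxlan0 is ultimately set by the flow table, which is what makes flexible control through flows possible.
tun0: connects the br0 switch to the host's network stack. It has an IP address configured, is used to talk to the containers on this node, and provides the gateway, DNS and similar functions to the Pods on the current host.

Also note that br0's fail_mode is secure, so OVS does not fall back to acting as an ordinary learning switch on its own; packets are forwarded only according to the flows that have been installed.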

2021-04-16

Monitoring

Notes on Prometheus rate, irate and increase

rate

func extrapolatedRate(vals []parser.Value, args parser.Expressions, enh *EvalNodeHelper, isCounter bool, isRate bool) Vector {
	ms := args[0].(*parser.MatrixSelector)
	vs := ms.VectorSelector.(*parser.VectorSelector)
	var (
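		// samples: the data of one metric over a period of time (a range vector).
		samples    = vals[0].(Matrix)[0]

		// enh.Ts: the time at which the query is evaluated.
		// ms.Range: the duration inside the brackets of the range vector selector, as a Duration.
		// vs.Offset: the duration after the offset modifier of the selector, as a Duration.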
		rangeStart = enh.Ts - durationMilliseconds(ms.Range+vs.Offset)
		rangeEnd   = enh.Ts - durationMilliseconds(vs.Offset)
	)

	// No sense in trying to compute a rate without at least two points. Drop
	// this Vector element.
	if len(samples.Points) < 2 {
		return enh.Out
	}

	resultValue := samples.Points[len(samples.Points)-1].V - samples.Points[0].V
	// If the exporter restarts, the counter starts again from zero and no longer lines up with earlier samples; the logic below corrects for that.
	if isCounter {
		var lastValue float64
		for _, sample := range samples.Points {
			if sample.V < lastValue {
				resultValue += lastValue
			}
			lastValue = sample.V
		}
	}

	// Duration between first/last samples and boundary of range.
	// Timestamp of the first sample in the range vector minus rangeStart. Timestamps are in milliseconds, so dividing by 1000 converts to seconds.
	durationToStart := float64(samples.Points[0].T-rangeStart) / 1000
	// rangeEnd minus the timestamp of the last sample (the one closest to the query time), converted to seconds.
	durationToEnd := float64(rangeEnd-samples.Points[len(samples.Points)-1].T) / 1000

	// Difference between the timestamps of the first and last samples in the range vector, converted to seconds.
	sampledInterval := float64(samples.Points[len(samples.Points)-1].T-samples.Points[0].T) / 1000
	// Average interval between samples.
	averageDurationBetweenSamples := sampledInterval / float64(len(samples.Points)-1)

	if isCounter && resultValue > 0 && samples.Points[0].V >= 0 {
		// Counters cannot be negative. If we have any slope at
		// all (i.e. resultValue went up), we can extrapolate
		// the zero point of the counter. If the duration to the
		// zero point is shorter than the durationToStart, we
		// take the zero point as the start of the series,
		// thereby avoiding extrapolation to negative counter
		// values.
		durationToZero := sampledInterval * (samples.Points[0].V / resultValue)
		if durationToZero < durationToStart {
			durationToStart = durationToZero
		}
	}

	// If the first/last samples are close to the boundaries of the range,
	// extrapolate the result. This is as we expect that another sample
	// will exist given the spacing between samples we've seen thus far,
	// with an allowance for noise.
	extrapolationThreshold := averageDurationBetweenSamples * 1.1
	extrapolateToInterval := sampledInterval

	if durationToStart < extrapolationThreshold {
		extrapolateToInterval += durationToStart
	} else {
		extrapolateToInterval += averageDurationBetweenSamples / 2
	}
	if durationToEnd < extrapolationThreshold {
		extrapolateToInterval += durationToEnd
	} else {
		extrapolateToInterval += averageDurationBetweenSamples / 2
	}
	resultValue = resultValue * (extrapolateToInterval / sampledInterval)
	if isRate {
		resultValue = resultValue / ms.Range.Seconds()
	}

	return append(enh.Out, Sample{
		Point: Point{V: resultValue},
	})
}


// === rate(node parser.ValueTypeMatrix) Vector ===
func funcRate(vals []parser.Value, args parser.Expressions, enh *EvalNodeHelper) Vector {
	return extrapolatedRate(vals, args, enh, true, true)
}

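Notes:
From line 44 of extrapolatedRate onwards (the if isCounter && resultValue > 0 check), the value can be worked out by following the arithmetic, but a few details are worth adding:

durationToStart and durationToEnd depend on the query time and on the metric's scrape times. If either one is not below extrapolationThreshold, averageDurationBetweenSamples / 2 is added to extrapolateToInterval in its place.

Example: take a metric A (counter type) that grows by 10 per second and is scraped every 5s, with scrapes at 00:01:00, 00:01:05, and so on up to 00:02:00. The query runs at 00:02:03 with the selector A[1m], so the range vector holds the 12 samples from 00:01:05 to 00:02:00. The calculation (pseudocode) then goes:

rangeStart = "00:02:03" - (1m + 0) // no offset, rangeStart == 00:01:03
rangeEnd = "00:02:03" - 0          // no offset, rangeEnd == 00:02:03
durationToStart = (timestamp of the first sample (00:01:05) - rangeStart) / 1000 // 2s
durationToEnd = (rangeEnd - timestamp of the last sample (00:02:00)) / 1000      // 3s

sampledInterval = (timestamp of the last sample (00:02:00) - timestamp of the first sample (00:01:05)) / 1000 // 55s

averageDurationBetweenSamples = sampledInterval / (number of samples in the range vector (12) - 1) // 55 / 11 = 5s

Plug the rest into the function above (from line 44 onwards). resultValue can be any made-up number, as long as it is plausible.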
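To check the arithmetic end to end, here is a Python transcription of the extrapolatedRate logic shown above. It is a sketch for experimenting, not Prometheus code. The scenario is an assumption: a counter growing by 10 per second, scraped every 5s, queried at 00:02:03 with a 1m range, so the window holds 12 samples from 00:01:05 to 00:02:00; the starting value of 1000 at 00:01:00 is made up.

```python
from typing import List, Optional, Tuple

Sample = Tuple[int, float]  # (timestamp in milliseconds, value)


def extrapolated_rate(samples: List[Sample], range_start_ms: int, range_end_ms: int,
                      is_counter: bool = True, is_rate: bool = True) -> Optional[float]:
    """Mirror the Go function above; returns None when fewer than two samples exist."""
    if len(samples) < 2:
        return None

    result = samples[-1][1] - samples[0][1]
    if is_counter:
        # Add back the value lost at every counter reset.
        last = 0.0
        for _, value in samples:
            if value < last:
                result += last
            last = value

    duration_to_start = (samples[0][0] - range_start_ms) / 1000
    duration_to_end = (range_end_ms - samples[-1][0]) / 1000
    sampled_interval = (samples[-1][0] - samples[0][0]) / 1000
    average_gap = sampled_interval / (len(samples) - 1)

    if is_counter and result > 0 and samples[0][1] >= 0:
        # Do not extrapolate back past the point where the counter would be zero.
        duration_to_zero = sampled_interval * (samples[0][1] / result)
        duration_to_start = min(duration_to_start, duration_to_zero)

    threshold = average_gap * 1.1
    extrapolate_to = sampled_interval
    extrapolate_to += duration_to_start if duration_to_start < threshold else average_gap / 2
    extrapolate_to += duration_to_end if duration_to_end < threshold else average_gap / 2

    result *= extrapolate_to / sampled_interval
    if is_rate:
        result /= (range_end_ms - range_start_ms) / 1000  # ms.Range in seconds
    return result


# Seconds since midnight: 00:01:05 is 65 and 00:02:00 is 120.
window = [(t * 1000, 1000.0 + 10 * (t - 60)) for t in range(65, 121, 5)]
per_second = extrapolated_rate(window, 63_000, 123_000)
total = extrapolated_rate(window, 63_000, 123_000, is_rate=False)
print(round(per_second, 6), round(total, 6))  # 10.0 600.0
```

The raw difference across the window is 550 over 55s; extrapolating by 2s at the start and 3s at the end scales it to 600 over the full 60s, which gives a rate of 10 per second.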

increase

// === increase(node parser.ValueTypeMatrix) Vector ===
func funcIncrease(vals []parser.Value, args parser.Expressions, enh *EvalNodeHelper) Vector {
	return extrapolatedRate(vals, args, enh, true, false)
}

increase and rate share the function extrapolatedRate. The only difference is that increase passes isRate = false, so the division at line 76 of extrapolatedRate (the if isRate block) is skipped.

irate

Source code (as of 2021/4/14):


// === irate(node parser.ValueTypeMatrix) Vector ===
func funcIrate(vals []parser.Value, args parser.Expressions, enh *EvalNodeHelper) Vector {
	return instantValue(vals, enh.Out, true)
}
....

func instantValue(vals []parser.Value, out Vector, isRate bool) Vector {
	samples := vals[0].(Matrix)[0]
	// No sense in trying to compute a rate without at least two points. Drop
	// this Vector element.
	if len(samples.Points) < 2 {
		return out
	}

	lastSample := samples.Points[len(samples.Points)-1]
	previousSample := samples.Points[len(samples.Points)-2]

	var resultValue float64
	if isRate && lastSample.V < previousSample.V {
		// Counter reset.
		resultValue = lastSample.V
	} else {
		resultValue = lastSample.V - previousSample.V
	}

	sampledInterval := lastSample.T - previousSample.T
	if sampledInterval == 0 {
		// Avoid dividing by 0.
		return out
	}

	if isRate {
		// Convert to per-second.
		resultValue /= float64(sampledInterval) / 1000
	}

	return append(out, Sample{
		Point: Point{V: resultValue},
	})
}

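The final calculation performed by irate:

(value of the last sample - value of the second-to-last sample) / (scrape time of the last sample in seconds - scrape time of the second-to-last sample in seconds)

If the last value is lower than the previous one (a counter reset), the numerator is just the value of the last sample.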
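The same rule as a Python transcription of instantValue, handy for trying out numbers. The sample values in the usage lines are made up:

```python
from typing import List, Optional, Tuple

Sample = Tuple[int, float]  # (timestamp in milliseconds, value)


def instant_rate(samples: List[Sample], is_rate: bool = True) -> Optional[float]:
    """Rate computed from only the last two samples, as irate does."""
    if len(samples) < 2:
        return None

    prev_t, prev_v = samples[-2]
    last_t, last_v = samples[-1]

    if is_rate and last_v < prev_v:
        # Counter reset: the counter restarted from zero, so use the last value as is.
        result = last_v
    else:
        result = last_v - prev_v

    interval_ms = last_t - prev_t
    if interval_ms == 0:
        return None  # avoid dividing by zero

    if is_rate:
        result /= interval_ms / 1000  # convert to per-second
    return result


print(instant_rate([(0, 100.0), (5000, 150.0)]))  # 10.0
print(instant_rate([(0, 100.0), (5000, 20.0)]))   # 4.0, counter reset
```

Only the last two points matter, which is why irate reacts quickly to spikes but is noisier than rate.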

2021-04-16

ssh►tunnel

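ssh tunnel notes

Command reference

Conventions:

SSH tunnel initiator: the local machine, the side that actively opens the connection with ssh.
SSH tunnel receiver: the remote machine, the side that passively accepts the ssh request.

ssh --help

-g Gateway mode: listen on all local addresses.
-L Local forwarding, in the form localPort:remoteServerHost:remotePort, mapping a local port to a remote address. remoteServerHost and remotePort can be any address reachable from the side that processes the data (the tunnel receiver).
-R Remote forwarding, in the form remotePort:targetHost:targetPort, mapping a port on the tunnel receiver to a target address. targetHost and targetPort can be any address reachable from the side that processes the data (the tunnel initiator).
-f Run in the background.
-N Do not open an ssh session (one is opened by default); usually combined with -f.
-C Request compression of the data in the session. On slow links compression helps; on fast links it only makes things worse.
-q Quiet mode. Most warning messages are suppressed.
-D Dynamic port forwarding: ssh listens on a local port and forwards every connection made to it through the secure tunnel to the server side. SOCKS4 and SOCKS5 are currently supported.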

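Local forwarding

The tunnel initiator listens on a port, accepts requests, and sends the packets to the remote server.

$ ssh -g -N -f -C -L 8080:google:80 root@selinux.org

Explanation: the local machine (the SSH initiator) listens on port 8080; packets go from the local machine to selinux.org, which then forwards them to google:80.

Remote forwarding

The tunnel receiver listens on a port, accepts requests, and sends the packets to the tunnel initiator.

$ ssh -g -N -f -C -R 6443:192.168.1.100:6443 root@selinux.org

Explanation: selinux.org listens on port 6443 and sends the packets to the local machine (the tunnel initiator), which then forwards them to 192.168.1.100:6443.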
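When tunnels are started from scripts, building the argument list in one place keeps the -L and -R forms from being mixed up. A small Python sketch; the helper names are hypothetical, and the result is only an argument list that could be passed to subprocess.run:

```python
from typing import List


def forward_spec(listen_port: int, target_host: str, target_port: int) -> str:
    """Build the port:host:hostport triple used by both -L and -R."""
    return f"{listen_port}:{target_host}:{target_port}"


def ssh_tunnel_argv(mode: str, listen_port: int, target_host: str,
                    target_port: int, ssh_target: str) -> List[str]:
    """Return the ssh command line for a local (-L) or remote (-R) forward."""
    flags = {"local": "-L", "remote": "-R"}
    if mode not in flags:
        raise ValueError(f"unknown mode: {mode}")
    spec = forward_spec(listen_port, target_host, target_port)
    return ["ssh", "-N", "-f", flags[mode], spec, ssh_target]


# The two commands from this section, without the optional -g and -C flags.
local_cmd = ssh_tunnel_argv("local", 8080, "google", 80, "root@selinux.org")
remote_cmd = ssh_tunnel_argv("remote", 6443, "192.168.1.100", 6443, "root@selinux.org")
print(" ".join(local_cmd))
print(" ".join(remote_cmd))
```

With "local" the listening port is on the initiator; with "remote" it is on the receiver. The target is always resolved by the opposite end of the tunnel.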

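socks4/socks5

$ ssh -D 1080 -N -f -C -q root@selinux.org

Explanation: the local machine listens on a port and accepts the socks4/socks5 protocols. Packets are sent to selinux.org and forwarded from there to whatever destination address each connection asks for.

Note: the two modes above are one-to-one, whereas this one is one-to-N: you can reach any address and port through it. OpenSSH's dynamic forwarding carries TCP connections only; UDP is not forwarded.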

2020-04-03

Network

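HTTPS notes

From Wikipedia

Hypertext Transfer Protocol Secure (HTTPS; often called HTTP over TLS, HTTP over SSL, or HTTP Secure) is a transport protocol for secure communication over a computer network. HTTPS communicates via HTTP but uses SSL/TLS to encrypt the packets. The main purpose of HTTPS is to authenticate the website's server and to protect the privacy and integrity of the exchanged data. The protocol was first proposed by Netscape in 1994 and then spread across the Internet.

Why HTTPS is needed

Plain HTTP transport faces these risks:

Eavesdropping: an attacker can read the content of the communication.
Tampering: an attacker can modify the content of the communication.
Impersonation: an attacker can take part in the communication under someone else's identity.

SSL/TLS

As the figure above shows, HTTPS has one more layer than HTTP: SSL/TLS.

SSL (Secure Socket Layer): developed by Netscape in 1994. The SSL protocol sits between TCP/IP and the various application-layer protocols and provides security for data communication.

TLS (Transport Layer Security): the successor to SSL. The first versions (SSL 1.0, SSL 2.0, SSL 3.0) were developed by Netscape; in 1999, starting from version 3.1, the protocol was standardized by the IETF and renamed, and TLS 1.0, TLS 1.1 and TLS 1.2 followed. SSL 3.0 and TLS 1.0 have known security vulnerabilities and are rarely used any more. TLS 1.3, which changes a great deal, was published as RFC 8446 in 2018; TLS 1.2 remains the most widely deployed version, with TLS 1.1 still in use.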

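Encryption algorithms

Symmetric encryption

Comes in two kinds, stream ciphers and block ciphers; encryption and decryption use the same key.

Examples: DES, AES-GCM, ChaCha20-Poly1305, etc.

Asymmetric encryption

The key used to encrypt and the key used to decrypt are different; they are called the public key and the private key. The public key and the algorithm are public, while the private key is kept secret. Asymmetric algorithms are slow but very strong, and by their nature the length of data they can encrypt is limited.

Examples: RSA, DSA, ECDSA, DH, ECDHE

How the protocol design evolved

Following the three HTTP transport risks listed above, first solve the first problem, then look for a way to solve the second.

Note: this is my own understanding of the design, not the actual standard.

Eavesdropping, only partly solved

If we simply use asymmetric encryption and design the protocol as in the figure above to encrypt the data, eavesdropping is not prevented: in step 4 a man in the middle can also receive the server's public key, so he can decrypt whatever the server sends.

Asymmetric encryption is also slow, which is far from ideal when large amounts of data are transferred. How can these two problems be solved?

The answer is to use asymmetric encryption during the protocol's "handshake" phase and symmetric encryption for the data afterwards.

The client sends a request to the server.
The server returns its own public key to the client.
The client generates a symmetric key SK for this conversation, encrypts it with the public key, and sends the encrypted SK to the server.
The server decrypts the encrypted SK with its private key to obtain SK; on success it returns OK to the client, otherwise it ends the conversation.
From then on, everything the client and the server exchange is encrypted with SK.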
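The five steps above can be walked through with a toy Python example. Everything here is for illustration only and is insecure by design: the RSA key uses tiny textbook primes, and the "symmetric cipher" is a SHA-256-based XOR keystream made up for this sketch. Neither is what TLS actually uses.

```python
import hashlib
import secrets

# Textbook RSA with tiny primes (61 and 53). Insecure, illustration only.
N = 61 * 53   # 3233, public modulus
E = 17        # public exponent
D = 2753      # private exponent, since (17 * 2753) % 3120 == 1


def keystream_xor(session_key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256-derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(f"{session_key}:{counter}".encode()).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Steps 1 and 2: the client asks, and the server answers with its public key (N, E).
# Step 3: the client picks a session key SK and encrypts it with the public key.
session_key = secrets.randbelow(N - 2) + 2
wrapped_key = pow(session_key, E, N)

# Step 4: the server recovers SK with its private key.
server_key = pow(wrapped_key, D, N)

# Step 5: both sides now use SK for the actual traffic.
request = b"GET / HTTP/1.1"
ciphertext = keystream_xor(session_key, request)
recovered = keystream_xor(server_key, ciphertext)
print(recovered == request)  # True
```

Only the short session key goes through the slow asymmetric operation; the bulk data uses the fast symmetric one, which is the point of the hybrid design.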

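Note: this is only one widely used way of exchanging the key. For the DH approach, see the article 完全吃透 TLS/SSL.

Solving eavesdropping, tampering and impersonation

In the HTTPS handshake, one side sends a request and the peer returns its public key, but the first side negotiates the key without verifying the peer's identity or its public key. A "man in the middle" spots this hole, intercepts the peer's public key, and substitutes his own. This one step of "taking the wrong public key", or "trusting the wrong peer", undoes everything HTTPS does for encryption (asymmetric encryption for key negotiation and symmetric encryption for the data).

This is where a trusted third-party certificate authority is needed to prove that the "server" is "trustworthy".

So how is a server proven to be trustworthy?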

2020-03-24

OpenStack►Storage

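An OpenStack functional test that ended with inconsistent XFS data

Preface

In the OpenStack cluster I am responsible for, the early planning left live migration running over the management network by default, and the management network was built on gigabit Ethernet. As a result, when many live migrations run at once, the gigabit network becomes the bottleneck.

The live_migration_inbound_addr option controls which network interface is used for migration.

The coronavirus gave me a very long winter break. On my first day back I planned to move live migration onto the storage network's 10-gigabit interfaces. The change looked simple, so I decided to skip the test cluster and make it directly in production. My thinking at the time was:

First day back from the break, inexplicably confident....
The nova-compute service only manages the virtual machine lifecycle; restarting it briefly does not affect the virtual machines, as long as the libvirtd service keeps running normally.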

2020-01-03

Network

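tcpdump notes, plus the three-way handshake and the four-way close

Option notes

-n One n disables reverse resolution of addresses; two (-nn) also disables resolution of ports.
-S Show absolute sequence numbers. For example, an ack of 1 shown for the client is relative; add this option to see the absolute number.
-v Show detailed packet information.
-r Takes the path of a pcap file; instead of capturing packets, tcpdump reads the given file and prints it in formatted form.
-i The interface to capture on; any listens on all of them.
-e Include the link-layer header of each packet in the printed line. I often use it to see the VLAN ID on trunk ports.
For details see: http://linux.51yip.com/search/tcpdump

Writing filters

All packets whose source or destination address is 192.168.1.1

$ tcpdump host 192.168.1.1

Packets whose source or destination port is 80

$ tcpdump port 80

All packets whose destination address is 192.168.1.1

$ tcpdump dst host 192.168.1.1

Likewise, dst port matches all packets whose destination port is 80

All packets whose source address is 192.168.1.1

$ tcpdump src host 192.168.1.1

Likewise, src port matches all packets whose source port is 80

Protocol filters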

tcpdump arp
tcpdump ip
tcpdump tcp
tcpdump udp
tcpdump icmp

2019-12-11

Docker►Container

What docker pull actually does

Docker manifest

mediaType

In the rest of this article, pay attention to the value of mediaType, which identifies the type of a file. The specification defines the following types:

application/vnd.docker.distribution.manifest.v1+json: the first version of the manifest format, since superseded by a second version (schemaVersion = 1)
application/vnd.docker.distribution.manifest.v2+json: the new version of the manifest format (schemaVersion = 2)
application/vnd.docker.distribution.manifest.list.v2+json: Manifest list, i.e. the manifest list file shown in the example above
application/vnd.docker.container.image.v1+json: Container config JSON
application/vnd.docker.image.rootfs.diff.tar.gzip: "Layer", as a gzipped tar, i.e. the compressed format of an image layer
application/vnd.docker.image.rootfs.foreign.diff.tar.gzip: "Layer", as a gzipped tar that should never be pushed
application/vnd.docker.plugin.v1+json: Plugin config JSON
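A client can branch on mediaType to decide what to fetch next. The Python sketch below only illustrates that branching; the function names and the wording of the returned steps are invented for this example, and it makes no network calls:

```python
MANIFEST_V1 = "application/vnd.docker.distribution.manifest.v1+json"
MANIFEST_V2 = "application/vnd.docker.distribution.manifest.v2+json"
MANIFEST_LIST_V2 = "application/vnd.docker.distribution.manifest.list.v2+json"


def accept_header() -> str:
    """Manifest types a client could advertise in the Accept header, newest first."""
    return ", ".join([MANIFEST_LIST_V2, MANIFEST_V2, MANIFEST_V1])


def next_step(media_type: str) -> str:
    """Describe what a pull would do after receiving a manifest of this type."""
    if media_type == MANIFEST_LIST_V2:
        return "pick the entry matching this platform, then fetch that manifest"
    if media_type == MANIFEST_V2:
        return "fetch the config blob, then each layer blob"
    if media_type == MANIFEST_V1:
        return "handle the legacy schema 1 manifest"
    raise ValueError(f"unexpected manifest mediaType: {media_type}")


print(next_step(MANIFEST_LIST_V2))
```

A manifest list adds one extra round trip: the client first resolves it to a platform-specific manifest and only then reaches the config and layer blobs.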